About
When working with large data volumes, it is sometimes unnecessary to have a database at all. It can be more perfoxrmant to read binary data directly from a file and then analyze it in Speedment. One such format for storing binary data is Avro.
Generating Code
Speedment uses the metadata in a database as the domain model when generating code. The metadata is stored in a speedment.json
-file, and unless you call mvn speedment:reload
, it will only connect to the database if that file doesn’t exist. When working with Avro-files, we use this to our advantage. Instead of using the database metadata to generate the speedment.json
-file, we can use a Maven plugin called speedment-avro-maven-plugin
to create it from a number of Avro-schemas. We can then run mvn speedment:generate
as usual to generate Java code.
<plugin>
<groupId>com.speedment.enterprise.plugins</groupId>
<artifactId>speedment-avro-maven-plugin</artifactId>
<version>${speedment.enterprise.version}</version>
<configuration>
<projectName>sakila</projectName>
<deleteExisting>true</deleteExisting>
<enableEnumPlugin>true</enableEnumPlugin>
<overridesFile>src/main/json/speedment_override.json</overridesFile>
<schemas>
<schema>
<tableId>film</tableId>
<schemaFile>src/main/avro/Film.avsc</schemaFile>
<overridesFile>src/main/json/speedment_film_override.json</overridesFile>
</schema>
</schemas>
</configuration>
<!-- Execute automatically -->
<executions>
<execution>
<id>Reload From Avro File</id>
<phase>generate-sources</phase>
<goals>
<goal>avro</goal>
</goals>
</execution>
</executions>
</plugin>
In the example above, the plugin is configured to generate a project called “sakila” with one table called “Film” based on the Avro schema src/main/avro/Film.avsc
. There are two so called <overridesFile>
specified; one for the project as a whole and one that is local only to the Film
table. These are optional .json
-files that you can create manually that will override any settings generated by the plugin. This is useful for configuring things that the Avro-plugin can’t yet figure out automatically, like the package structure of your project or which type mappers to use.
Override Generated speedment.json
Here is an example of how src/main/json/speedment_override.json
could look:
{
"config" : {
"id" : "sakila",
"name" : "sakila",
"companyName" : "speedment",
"packageName" : "com.speedment.example.sakila",
"appId" : "b9c8eb47-3810-4f45-972b-f2cf64f43d71",
"dbmses" : [
{
"id" : "sakila",
"alias" : "my_dbms",
"schemas" : [
{
"id" : "sakila"
"alias" : "my_schema"
}
]
}
]
}
}
We set a custom company
and packageName
on the project-level and also specify a custom alias
for both the generated dbms
and the schema
. Note that the id
of both project
, dbms
and schema
will be the value you specified as projectName
in the maven plugin configuration.
Override Table Specific Settings
Here is an example of how the table-specific src/main/json/speedment_film_override.json
could look:
{
"config" : {
"id" : "sakila",
"dbmses" : [
{
"id" : "sakila",
"schemas" : [
{
"id" : "sakila",
"tables" : [
{
"id" : "film",
"avroDataFile" : "Film.avro",
"columns" : [
{
"id" : "film_id",
"alias" : "id"
},
{
"id" : "rating",
"enabled" : false
}
]
}
]
}
]
}
]
}
}
We specify the location of the binary data file for the Film
table, and also change the default settings for two of the columns. The first one (film_id
) we give an alias to so that it will be called just id
in the code. The second one (rating
) we disable since we don’t want it to be generated at all.
Automating Build Process
The next step is to invoke mvn speedment:generate
automatically after the speedment-avro-maven-plugin
has been invoked. This can be done like this:
<plugin>
<groupId>com.speedment.enterprise</groupId>
<artifactId>speedment-enterprise-maven-plugin</artifactId>
<version>${speedment.enterprise.version}</version>
<configuration>
<components>
<component>com.speedment.enterprise.datastore.tool.DataStoreToolBundle</component>
<component>com.speedment.enterprise.plugins.avro.generator.AvroGeneratorBundle</component>
</components>
</configuration>
<executions>
<execution>
<id>Generate Speedment Sources</id>
<phase>generate-sources</phase>
<goals>
<goal>generate</goal>
</goals>
</execution>
</executions>
</plugin>
Finally, you need to make sure that you have the avro-dependencies on the classpath. To fix this, simply add the avro-runtime
-dependency to the pom.xml
-file.
<dependency>
<groupId>com.speedment.enterprise.plugins</groupId>
<artifactId>avro-runtime</artifactId>
</dependency>
Using Avro Files at Runtime
To load data from the .avro
-files instead of a database, you need to add the AvroRuntimeBundle
to the Speedment application builder and append withSkipCheckDatabaseConnectivity()
.
new AmericanApplicationBuilder()
.withSkipCheckDatabaseConnectivity()
.withBundle(AvroRuntimeBundle.class)
.withBundle(DataStoreBundle.class)
.build();
Note that the order is important! The AvroRuntimeBundle
needs to come before the DataStoreBundle
since both specify a custom StreamSupplierComponent
.
Customizing Generated Code
When the code is generated, it will look a bit different than if a regular SQL database was used. First, you no longer have any SqlAdapter.java
-files, but instead a number of AvroAdapter.java
-files. These handle the deserialization from Avro to Speedment.
To change the location of the data file (the one that ends with .avro
) at runtime, you can override the dataFile
-method in XXXSqlAdapter.java
. For an example, in the Sakila demo, I could override the method like this:
public class FilmAvroAdapter extends GeneratedFilmAvroAdapter {
@Config(name = "film.data.file", value = "Film.avro")
private String filmDataFile;
@Override
protected Path dataFile(ProjectComponent projects) {
return Paths.get(filmDataFile);
}
}
I can now change the location of my data file without recompiling by simply creating a settings.properties
-file in the root of my project with the line:
film.data.file = another_dir/another_film.avro
And it will be used instead.
Questions and Discussion
If you have any question, don’t hesitate to reach out to the Speedment developers on Gitter.