Templates are pre-built projects that you can use as example datasets for reference, or extend and modify to suit your needs. This guide walks through the pre-generated project "Students", which can be found in the templates section on the initial DataBake screen.
To the right of the project name "Students", you will see four tabs titled "Columns", "Insights", "Preview" and "Export" respectively. These tabs will lead you through the process of using DataBake.
1. Define criteria and insights
DataBake lets users work backwards, starting by defining the look of the final dataset and its insights. In this example, the goal is to generate a dummy dataset that shows the relationship between attendance and exam results.
After opening the "Students" template, you will see four pre-built "Columns" (data types) in the main section of the page. DataBake offers users a wide range of pre-defined columns, covering demographics, financial data, spending behaviour, and more. The columns used in this template, however, are an example of using the "Custom" columns feature.
A diagram of the pre-built relationships between the columns is shown on the left-hand side (as below).
These relationships between data types have been built using the "Insights" tab. They can be altered to change the nature of the insight, for example to increase the correlation between test scores and attendance.
Pre-built insight for "TestScore"
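To make the idea of "increasing the correlation" concrete, here is a minimal Python sketch of two columns whose relationship is controlled by a single number. It is not how DataBake builds insights internally. The function names, the means and the spreads are all hypothetical and chosen only for illustration.

```python
import random
import statistics


def make_students(n, correlation, seed=0):
    """Return n (attendance, test_score) pairs with roughly the given correlation.

    Illustrative only: the means and spreads below are made-up values,
    not settings taken from the "Students" template.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Two independent standard normal draws.
        z1 = rng.gauss(0, 1)
        z2 = rng.gauss(0, 1)
        # Blend them so the second variable tracks the first to the chosen degree.
        mixed = correlation * z1 + (1 - correlation ** 2) ** 0.5 * z2
        attendance = 85 + 8 * z1      # hypothetical mean 85, spread 8
        test_score = 65 + 12 * mixed  # hypothetical mean 65, spread 12
        rows.append((attendance, test_score))
    return rows


def pearson(rows):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    xs, ys = zip(*rows)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in rows)
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)
```

Calling `pearson(make_students(5000, 0.8))` should give a value close to 0.8, while a lower `correlation` argument produces a visibly looser relationship between the two columns.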
2. Preview cooked up datasets
The "Preview" tab provides a snapshot of the overall nature of the dataset before generation. This allows for further adjustment if the resulting data does not yet match the initially defined criteria.
When you have multiple distributions in your dataset, scatter plots of each column will become available, allowing you to easily visualise the relationships between columns. To allow flexibility in this view, you can deselect columns you are not interested in seeing and colour-code the points by categorical data (such as gender).
DataBake scatter plot of "Students" template
3. Export into spreadsheets or databases
Lastly, the "Export" tab lets you generate up to 100,000 rows of realistic data as a CSV file. This data can be imported into spreadsheets or databases and visualised (see below). It can be used in product development, demos, testing and academia.
Visualisation of 'Students' template
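As one way of importing the exported file into a database, the sketch below loads a CSV into SQLite using only the Python standard library. It assumes the first row of the file is a header. The function name, the table name and the file name in the usage comment are hypothetical, and the real column names depend on your export.

```python
import csv
import sqlite3


def load_csv(csv_file, conn, table="students"):
    """Copy rows from an open CSV file into a new SQLite table.

    Assumes the first row holds the column names. Values are stored
    as text exactly as they appear in the file.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    # The reader yields one list per remaining row.
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()
    return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]


# Example usage (file name is hypothetical):
#   conn = sqlite3.connect("students.db")
#   with open("students.csv", newline="") as f:
#       print(load_csv(f, conn), "rows imported")
```

The table is created without declared column types to keep the sketch short. For real analysis you would probably declare numeric columns explicitly or convert the values on import.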
Grab a slice of this pre-generated template by selecting "Students", or
start baking your very own personalised dataset now.