Creating WDC connectors for FastAPI using Pydantic model definitions

One downside of the free version of Tableau is that I cannot create direct connections to my databases. In QlikView I would generally just write queries to be loaded on refresh.
There is a way around this, even if it is a bit more convoluted than just typing in SELECT * FROM Table.
We can use Tableau's Web Data Connectors (WDC). A connector consists of an HTML page that loads all the libraries and shows a button, plus a JS script that declares the schema of the table and makes the GET call to the API.

This is just a quick and dirty web data connector to present the data to
Tableau. Currently it is only set up for Strava activities.

Since for this specific project I was already running Vue, the WDC connector can just be stored in the client\public folder and it will be served properly for Tableau Desktop to use as a data source. We could also autogenerate a JSON on Airflow periodically.

Setting up the HTML

We are going to be keeping the HTML short and sweet. All we really need is to generate a landing page and load all our external dependencies.

<html>

<head>
    <title>Strava Activities</title>
    <meta http-equiv="Cache-Control" content="no-store" />

    <link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css" rel="stylesheet"
        integrity="sha384-1q8mTJOASx8j1Au+a5WDVnPi2lkFfwwEAa8hDDdjZlpLegxhjVME1fgjWPGmkzs7" crossorigin="anonymous">
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js" type="text/javascript"></script>
    <script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/js/bootstrap.min.js"
        integrity="sha384-0mSbJDEHialfmuBBQP6A4Qrprq5OVfW37PRR3j5ELqxss1yVqOtnepnHVP9aJ7xS"
        crossorigin="anonymous"></script>

    <script src="https://connectors.tableau.com/libs/tableauwdc-2.3.latest.js" type="text/javascript"></script>
</head>

<body>
    <div class="container container-table">
        <div class="row vertical-center-row">
            <div class="text-center col-md-4 col-md-offset-4">
                <button type="button" id="submitButton" class="btn btn-success" style="margin: 10px;">Get Strava
                    Data!</button>
            </div>
        </div>
    </div>
</body>

The important dependencies are jQuery and the Tableau WDC library served from connectors.tableau.com. All our code really does is submit the connector when the button is pressed. There are ways to add inputs for additional filters, but since this is only for my individual data I'm not too worried about it.

Setting up the JavaScript

Now this is where the meat of the connector is located. A set of great examples is available on the Web Data Connector website.

The script requires two things:

  • Schema
  • API call

For the schema we already have a pydantic model defined:

class StravaActivityCreate(BaseModel):
    """
    Pydantic class used for data validation.
    Used to validate SQL inserts into Strava Table

    Args:
        BaseModel (BaseModel): Standard pydantic Base Model
    """
    name: str
    type: str
    start_date: datetime
    distance: float
    moving_time: int
    average_speed: Optional[float] = None
    max_speed: Optional[float] = None
    average_cadence: Optional[float] = None
    average_heartrate: Optional[float] = None
    weighted_average_watts: Optional[float] = None
    kilojoules: Optional[float] = None

Translating that into our JS definition is pretty straightforward. We just replace the Python data types with the ones defined by Tableau's dataTypeEnum and declare them as the columns of the schema.

            myConnector.getSchema = function (schemaCallback) {
                var cols = [{
                    id: "name",
                    dataType: tableau.dataTypeEnum.string
                }, {
                    id: "type",
                    dataType: tableau.dataTypeEnum.string
                }, {
                    id: "start_date",
                    dataType: tableau.dataTypeEnum.datetime
                }, {
                    id: "distance",
                    dataType: tableau.dataTypeEnum.float
                }, {
                    id: "moving_time",
                    dataType: tableau.dataTypeEnum.int
                }, {
                    id: "average_speed",
                    dataType: tableau.dataTypeEnum.float
                }, {
                    id: "max_speed",
                    dataType: tableau.dataTypeEnum.float
                }, {
                    id: "average_cadence",
                    dataType: tableau.dataTypeEnum.float
                }, {
                    id: "average_heartrate",
                    dataType: tableau.dataTypeEnum.float
                }, {
                    id: "weighted_average_watts",
                    dataType: tableau.dataTypeEnum.float
                }, {
                    id: "kilojoules",
                    dataType: tableau.dataTypeEnum.float
                }, {
                    id: "id",
                    dataType: tableau.dataTypeEnum.int
                }];

                var tableSchema = {
                    id: "strava_activity",
                    alias: "Personal Strava Activity Data",
                    columns: cols
                };

                schemaCallback([tableSchema]);
            };

Our actual schema definition is formed with the snippet at the end:

                var tableSchema = {
                    id: "strava_activity",
                    alias: "Personal Strava Activity Data",
                    columns: cols
                };

That object is wrapped in a list and passed to schemaCallback at the end of our connector's getSchema. Now we need to give the connector a way to actually get the data from our API. This requires setting the getData method of our myConnector object, which will be a basic GET call using jQuery's getJSON.

            myConnector.getData = function (table, doneCallback) {
                $.getJSON("http://rasp-srv:8000/strava", function (resp) {
                    var feat = resp,
                        tableData = [];

                    // Iterate over the JSON object
                    for (var i = 0, len = feat.length; i < len; i++) {
                        tableData.push({
                            "name": feat[i].name,
                            "type": feat[i].type,
                            "start_date": feat[i].start_date,
                            "distance": feat[i].distance,
                            "moving_time": feat[i].moving_time,
                            "average_speed": feat[i].average_speed,
                            "max_speed": feat[i].max_speed,
                            "average_cadence": feat[i].average_cadence,
                            "average_heartrate": feat[i].average_heartrate,
                            "weighted_average_watts": feat[i].weighted_average_watts,
                            "kilojoules": feat[i].kilojoules,
                            "id": feat[i].id
                        });
                    }

                    table.appendRows(tableData);
                    doneCallback();
                });
            };

            tableau.registerConnector(myConnector);

We define each key of the row object and assign it the corresponding value from our feat JSON response. Finally we append the collected rows to the table object that Tableau passes into getData.
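The loop above assumes the endpoint returns a JSON array of flat objects whose keys match the column ids. The following is only a rough sketch of that contract on the Python side, using the standard library; the `to_wdc_rows` helper, the `COLUMN_IDS` subset, and the sample record are all made up for illustration and are not part of the actual API.

```python
import json
from datetime import datetime

# Hypothetical subset of the column ids declared in getSchema.
COLUMN_IDS = ["name", "type", "start_date", "distance", "average_heartrate"]


def to_wdc_rows(records, column_ids):
    """Keep only the schema columns and make each value JSON-serialisable."""
    rows = []
    for record in records:
        row = {}
        for column in column_ids:
            value = record.get(column)  # missing keys become null in the JSON
            if isinstance(value, datetime):
                value = value.isoformat()  # datetimes travel as ISO 8601 strings
            row[column] = value
        rows.append(row)
    return rows


# Made-up sample record standing in for a database row.
sample = [{
    "name": "Morning Ride",
    "type": "Ride",
    "start_date": datetime(2021, 3, 14, 7, 30),
    "distance": 25000.0,
    "average_heartrate": 142.0,
    "internal_note": "not part of the schema",
}]

payload = json.dumps(to_wdc_rows(sample, COLUMN_IDS))
print(payload)
```

Building each row from the list of column ids, rather than typing every key by hand, keeps the row keys and the schema from drifting apart.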

Once that is done we register our connector with tableau.registerConnector. Finally we create the event listener for when the user presses the button on our landing page.

            $(document).ready(function () {
                $("#submitButton").click(function () {
                    tableau.connectionName = "Strava Activity Feed"; // This will be the data source name in Tableau
                    tableau.submit(); // This sends the connector object to Tableau
                });
            });

So overall our connector will look like this:

<html>

<head>
    <title>Strava Activities</title>
    <meta http-equiv="Cache-Control" content="no-store" />

    <link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css" rel="stylesheet"
        integrity="sha384-1q8mTJOASx8j1Au+a5WDVnPi2lkFfwwEAa8hDDdjZlpLegxhjVME1fgjWPGmkzs7" crossorigin="anonymous">
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js" type="text/javascript"></script>
    <script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/js/bootstrap.min.js"
        integrity="sha384-0mSbJDEHialfmuBBQP6A4Qrprq5OVfW37PRR3j5ELqxss1yVqOtnepnHVP9aJ7xS"
        crossorigin="anonymous"></script>

    <script src="https://connectors.tableau.com/libs/tableauwdc-2.3.latest.js" type="text/javascript"></script>
</head>

<body>
    <div class="container container-table">
        <div class="row vertical-center-row">
            <div class="text-center col-md-4 col-md-offset-4">
                <button type="button" id="submitButton" class="btn btn-success" style="margin: 10px;">Get Strava
                    Data!</button>
            </div>
        </div>
    </div>
</body>

<script type = "text/javascript">
    (function () {
            //Create the connector object
            var myConnector = tableau.makeConnector();

            // Define the schema
            myConnector.getSchema = function (schemaCallback) {
                var cols = [{
                    id: "name",
                    dataType: tableau.dataTypeEnum.string
                }, {
                    id: "type",
                    dataType: tableau.dataTypeEnum.string
                }, {
                    id: "start_date",
                    dataType: tableau.dataTypeEnum.datetime
                }, {
                    id: "distance",
                    dataType: tableau.dataTypeEnum.float
                }, {
                    id: "moving_time",
                    dataType: tableau.dataTypeEnum.int
                }, {
                    id: "average_speed",
                    dataType: tableau.dataTypeEnum.float
                }, {
                    id: "max_speed",
                    dataType: tableau.dataTypeEnum.float
                }, {
                    id: "average_cadence",
                    dataType: tableau.dataTypeEnum.float
                }, {
                    id: "average_heartrate",
                    dataType: tableau.dataTypeEnum.float
                }, {
                    id: "weighted_average_watts",
                    dataType: tableau.dataTypeEnum.float
                }, {
                    id: "kilojoules",
                    dataType: tableau.dataTypeEnum.float
                }, {
                    id: "id",
                    dataType: tableau.dataTypeEnum.int
                }];

                var tableSchema = {
                    id: "strava_activity",
                    alias: "Personal Strava Activity Data",
                    columns: cols
                };

                schemaCallback([tableSchema]);
            };

            // Download the data
            myConnector.getData = function (table, doneCallback) {
                $.getJSON("http://rasp-srv:8000/strava", function (resp) {
                    var feat = resp,
                        tableData = [];

                    // Iterate over the JSON object
                    for (var i = 0, len = feat.length; i < len; i++) {
                        tableData.push({
                            "name": feat[i].name,
                            "type": feat[i].type,
                            "start_date": feat[i].start_date,
                            "distance": feat[i].distance,
                            "moving_time": feat[i].moving_time,
                            "average_speed": feat[i].average_speed,
                            "max_speed": feat[i].max_speed,
                            "average_cadence": feat[i].average_cadence,
                            "average_heartrate": feat[i].average_heartrate,
                            "weighted_average_watts": feat[i].weighted_average_watts,
                            "kilojoules": feat[i].kilojoules,
                            "id": feat[i].id
                        });
                    }

                    table.appendRows(tableData);
                    doneCallback();
                });
            };

            tableau.registerConnector(myConnector);

            // Create event listeners for when the user submits the form
            $(document).ready(function () {
                $("#submitButton").click(function () {
                    tableau.connectionName = "Strava Activity Feed"; // This will be the data source name in Tableau
                    tableau.submit(); // This sends the connector object to Tableau
                });
            });
        })();
</script>

</html>

Taking it a step further

The connector already works. However, if we needed to create one for each of our tables, writing them by hand would take a while. There are a couple of ways to speed this up. An easy one is to reuse the previously defined pydantic classes to generate the code.

To do so we will use jinja2 templating to create self-contained files for all of our connectors.

Converting our schema definition

Assuming a pydantic class as follows:


class StravaActivityCreate(BaseModel):
    name: str
    type: str
    start_date: datetime
    distance: float
    moving_time: int
    average_speed: Optional[float] = None
    max_speed: Optional[float] = None
    average_cadence: Optional[float] = None
    average_heartrate: Optional[float] = None
    weighted_average_watts: Optional[float] = None
    kilojoules: Optional[float] = None
    test_date: date

We first set up the equivalences between the Python types and the Tableau ones:

python_to_tableau_dict={
'float':'tableau.dataTypeEnum.float',
'int':'tableau.dataTypeEnum.int',
'datetime':'tableau.dataTypeEnum.datetime',
'date':'tableau.dataTypeEnum.date',
'str':'tableau.dataTypeEnum.string'}

python_to_tableau_dict = defaultdict(lambda: None, python_to_tableau_dict)

We use a defaultdict to take care of unmapped types (None is the fallback here, but it could be something like undefined instead).
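One detail to keep in mind with this mapping: `dict.get` never calls a defaultdict's default factory, only indexing with `[]` does. A minimal sketch with a shortened copy of the mapping; the string fallback and the `"Decimal"` key are assumptions chosen only to make the difference visible.

```python
from collections import defaultdict

# Shortened copy of the mapping; the fallback value is an assumption for illustration.
python_to_tableau_dict = defaultdict(
    lambda: "tableau.dataTypeEnum.string",
    {
        "float": "tableau.dataTypeEnum.float",
        "int": "tableau.dataTypeEnum.int",
    },
)

# .get() bypasses the default factory and returns its own default, None.
via_get = python_to_tableau_dict.get("Decimal")

# Indexing calls the factory and also stores the new key in the dictionary.
via_index = python_to_tableau_dict["Decimal"]

print(via_get)
print(via_index)
```

So if a fallback other than None is wanted, the lookup has to use `python_to_tableau_dict[name]` rather than `.get(name)`.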

Now all we need to do to obtain the column definitions is iterate over the model's __fields__ attribute.

schema=[]
for k,i in StravaActivityCreate.__fields__.items():
    schema.append({'id':k, 'dataType':python_to_tableau_dict.get(i.type_.__name__)})

For each field, type_.__name__ gives us the name of its type as a string (this is the pydantic v1 field API). We now have a list of dictionaries holding our table schema definition in terms Tableau will understand.
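The `type_` attribute is specific to pydantic v1 fields; pydantic v2 exposes `model_fields` with an `annotation` attribute instead. As a version-independent sketch, the same mapping can be built from plain type annotations with the standard `typing` module (Python 3.8+). The `Activity` class, `unwrap_optional`, and `build_schema` below are hypothetical stand-ins, not the code used in this project.

```python
from datetime import date, datetime
from typing import Optional, Union, get_args, get_origin, get_type_hints

# Same equivalences as above, keyed by the type objects themselves.
PYTHON_TO_TABLEAU = {
    float: "tableau.dataTypeEnum.float",
    int: "tableau.dataTypeEnum.int",
    datetime: "tableau.dataTypeEnum.datetime",
    date: "tableau.dataTypeEnum.date",
    str: "tableau.dataTypeEnum.string",
}


class Activity:  # hypothetical stand-in for a model class
    name: str
    start_date: datetime
    moving_time: int
    max_speed: Optional[float] = None


def unwrap_optional(annotation):
    """Return T for Optional[T] (Union[T, None]); leave anything else unchanged."""
    if get_origin(annotation) is Union:
        args = [arg for arg in get_args(annotation) if arg is not type(None)]
        if len(args) == 1:
            return args[0]
    return annotation


def build_schema(cls):
    """Build the list of column dictionaries, failing loudly on unmapped types."""
    schema = []
    for field_name, annotation in get_type_hints(cls).items():
        py_type = unwrap_optional(annotation)
        if py_type not in PYTHON_TO_TABLEAU:
            raise TypeError(f"No Tableau type for {field_name}: {py_type!r}")
        schema.append({"id": field_name, "dataType": PYTHON_TO_TABLEAU[py_type]})
    return schema


schema = build_schema(Activity)
print(schema)
```

Raising on an unmapped type is a design choice in this sketch: it avoids silently writing a Python None into the generated JavaScript.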

Jinja2 Template

In order to use jinja2 we need to provide a template. For this template we need to pass the following variables:

  • schema_name: What we want to call our schema
  • schema_description: A description of our table
  • fields: The list of column dictionaries containing the schema for our table
  • api_endpoint: The location of our api endpoint

The html part of our template will be exactly the same as the one we created above, just a button with submit. We just use the schema_name to change some of the values.

<head>
    <title>{{schema_name}}</title>
    <meta http-equiv="Cache-Control" content="no-store" />

    <link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css" rel="stylesheet"
        integrity="sha384-1q8mTJOASx8j1Au+a5WDVnPi2lkFfwwEAa8hDDdjZlpLegxhjVME1fgjWPGmkzs7" crossorigin="anonymous">
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js" type="text/javascript"></script>
    <script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/js/bootstrap.min.js"
        integrity="sha384-0mSbJDEHialfmuBBQP6A4Qrprq5OVfW37PRR3j5ELqxss1yVqOtnepnHVP9aJ7xS"
        crossorigin="anonymous"></script>

    <script src="https://connectors.tableau.com/libs/tableauwdc-2.3.latest.js" type="text/javascript"></script>
</head>

<body>
    <div class="container container-table">
        <div class="row vertical-center-row">
            <div class="text-center col-md-4 col-md-offset-4">
                <button type="button" id="submitButton" class="btn btn-success" style="margin: 10px;">Get {{schema_name}}
                    Data!</button>
            </div>
        </div>
    </div>
</body>

Now for the javascript we will take advantage of a few things. One will be looping in our jinja2 template. So our schema definition in the template will just be:

        myConnector.getSchema = function (schemaCallback) {
            var cols = [
                {% for field in fields -%}
                {id: "{{field.id}}", dataType: {{field.dataType}},},
                {% endfor -%}
                ];
            var tableSchema = {
                id: "{{schema_name}}",
                alias: "{{schema_description}}",
                columns: cols
            };

We just iterate over every item in our fields list and write the corresponding JavaScript.

The other change takes advantage of the fact that the keys in the API response are named the same as our column ids (for example "name": feat[i].name), so the API call can copy every property generically.

        myConnector.getData = function (table, doneCallback) {
            $.getJSON("{{api_endpoint}}", function (resp) {
                var feat = resp;
                var tableData = [];

                // Iterate over the JSON object
                for (var i = 0, len = feat.length; i < len; i++) {
                    var tableEntry = {};
                    var ref = feat[i];
                    Object.getOwnPropertyNames(ref).forEach(function(val, idx, array){
                        tableEntry[val] = ref[val];
                    });
                    tableData.push(tableEntry);
                }
                table.appendRows(tableData);
                doneCallback();
            });
        };

Using getOwnPropertyNames lets us generalize our method for all of our schemas.

Our whole template looks as follows:

<html>

<head>
    <title>{{schema_name}}</title>
    <meta http-equiv="Cache-Control" content="no-store" />

    <link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css" rel="stylesheet"
        integrity="sha384-1q8mTJOASx8j1Au+a5WDVnPi2lkFfwwEAa8hDDdjZlpLegxhjVME1fgjWPGmkzs7" crossorigin="anonymous">
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js" type="text/javascript"></script>
    <script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/js/bootstrap.min.js"
        integrity="sha384-0mSbJDEHialfmuBBQP6A4Qrprq5OVfW37PRR3j5ELqxss1yVqOtnepnHVP9aJ7xS"
        crossorigin="anonymous"></script>

    <script src="https://connectors.tableau.com/libs/tableauwdc-2.3.latest.js" type="text/javascript"></script>
</head>

<body>
    <div class="container container-table">
        <div class="row vertical-center-row">
            <div class="text-center col-md-4 col-md-offset-4">
                <button type="button" id="submitButton" class="btn btn-success" style="margin: 10px;">Get {{schema_name}}
                    Data!</button>
            </div>
        </div>
    </div>
</body>

<script type="text/javascript">
    (function () {
        //Create the connector object
        var myConnector = tableau.makeConnector();

        // Define the schema
        myConnector.getSchema = function (schemaCallback) {
            var cols = [
                {% for field in fields -%}
                {id: "{{field.id}}", dataType: {{field.dataType}},},
                {% endfor -%}
                ];
            var tableSchema = {
                id: "{{schema_name}}",
                alias: "{{schema_description}}",
                columns: cols
            };

            schemaCallback([tableSchema]);
        };

        // Download the data
        myConnector.getData = function (table, doneCallback) {
            $.getJSON("{{api_endpoint}}", function (resp) {
                var feat = resp;
                var tableData = [];

                // Iterate over the JSON object
                for (var i = 0, len = feat.length; i < len; i++) {
                    var tableEntry = {};
                    var ref = feat[i];
                    Object.getOwnPropertyNames(ref).forEach(function(val, idx, array){
                        tableEntry[val] = ref[val];
                    });
                    tableData.push(tableEntry);
                }
                table.appendRows(tableData);
                doneCallback();
            });
        };

        tableau.registerConnector(myConnector);

        // Create event listeners for when the user submits the form
        $(document).ready(function () {
            $("#submitButton").click(function () {
                tableau.connectionName = "{{schema_name}} Feed"; // This will be the data source name in Tableau
                tableau.submit(); // This sends the connector object to Tableau
            });
        });
    })();
</script>

</html>

Now it's very straightforward: we pass the variables to the template in jinja2 and save the result.

from pydantic import BaseModel
from datetime import datetime, date
from typing import Optional
from collections import defaultdict
from jinja2 import Environment, FileSystemLoader
import pathlib

...

# Look for the template in the same folder as this script
env = Environment(loader=FileSystemLoader(pathlib.Path(__file__).parent))
template = env.get_template(template_name)
template_variables = {'schema_name': schema_name,
                      'schema_description': schema_description,
                      'fields': schema,
                      'api_endpoint': api_endpoint}
html_out = template.render(template_variables)
# Write the rendered connector to disk
with open(file_name, 'w') as fh:
    fh.write(html_out)

Running that program generates the WDC file, which can then be served however we want.

We turn this into a function and can then easily convert all our classes to WDCs.

from pydantic import BaseModel
from collections import defaultdict
from jinja2 import Environment, FileSystemLoader
from datetime import datetime
from typing import Type
import pathlib


def gen_wdc_from_pydantic_class(PydanticModel: Type[BaseModel],
                                api_endpoint: str, schema_name: str,
                                schema_description: str,
                                file_name: str,
                                template_name: str) -> None:
    """
    Generate a WDC file for Tableau from a Pydantic class.
    This does not handle fields defined as List. Any aggregation would
    need to be done at the JS level; the WDC file can be modified
    to accommodate this.

    Args:
        PydanticModel (BaseModel): Pydantic base model representing the table
        api_endpoint (str): location of the API endpoint
        schema_name (str): name for the table in WDC
        schema_description (str): table description
        file_name (str): desired name of the wdc connector
        template_name (str): location of the template for jinja
    """
    python_to_tableau_dict = {
        'float': 'tableau.dataTypeEnum.float',
        'int': 'tableau.dataTypeEnum.int',
        'datetime': 'tableau.dataTypeEnum.datetime',
        'date': 'tableau.dataTypeEnum.date',
        'str': 'tableau.dataTypeEnum.string'}

    python_to_tableau_dict = defaultdict(lambda: None, python_to_tableau_dict)

    schema = []
    for k, i in PydanticModel.__fields__.items():
        schema.append(
            {'id': k,
             'dataType': python_to_tableau_dict.get(i.type_.__name__)})

    env = Environment(loader=FileSystemLoader(pathlib.Path(__file__).parent))
    template = env.get_template(template_name)
    template_variables = {'schema_name': schema_name,
                          'schema_description': schema_description,
                          'fields': schema,
                          'api_endpoint': api_endpoint}
    html_out = template.render(template_variables)
    with open(file_name, 'w') as fh:
        fh.write(html_out)


if __name__ == "__main__":

    class ExampleSchema(BaseModel):
        name: str
        type: str
        start_date: datetime
        distance: float

    class_input = {
        "PydanticModel": ExampleSchema,
        # Include the scheme so the browser treats this as an absolute URL
        "api_endpoint": 'http://localhost:8000/myendpoint',
        "schema_name": 'MyCoolTable',
        "schema_description": 'A Super Cool Table',
        "file_name": 'MyCoolTableWDC.html',
        "template_name": 'tableau_wdc_template.html'
    }

    gen_wdc_from_pydantic_class(**class_input)

Now from Tableau Desktop we can just import our data using the WDC option.

The completed function can be accessed on my GitHub under deployment/tableau