In the course of working on my book, I wanted to build an easy-to-use website for outlier detection. The idea here is that I have a REST API to perform the outlier detection work but I’d like something a little easier to read than JSON blobs coming out of Postman. That’s where Streamlit comes into play.

Setting Expectations with Streamlit

Streamlit is a simple-to-use Python library, with its own built-in web server, which allows you to create web apps in Python. It mainly targets data scientists and people working on internal applications. It isn’t a replacement for a heavyweight product like Django (for web applications) or Dash (for dashboarding). Instead, it’s designed to get a good-enough site up and running quickly, and that’s exactly what I want.

I’m not sure whether this would be an acceptable solution for internal corporate applications, but I could see it being useful enough for small teams needing to display data quickly and easily. It integrates really well with a variety of Python plotting libraries and has built-in capabilities to handle Pandas DataFrames.

Installation

Installing Streamlit is rather easy: pip install streamlit will do the trick. From there, I created a file called site.py and got going; streamlit run site.py serves it locally. Streamlit includes a cheat sheet and some good documentation, making it easy to get started.

Building a Site

First up, we’re going to need some import statements.

import streamlit as st
import requests
import pandas as pd
import json
import plotly.express as px

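Streamlit is obvious. The requests library allows us to make web requests, which is how we’ll hit the outlier detection service. Pandas gives us the DataFrames we will display, and json handles the JSON going to and from the service. Plotly Express is a high-level interface which makes it easy to build Plotly charts.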

Writing out text on a page is easy using the write() function in Streamlit. It accepts markdown and converts it to HTML for us. For example, here’s a bit of markdown providing instructions on using the outlier detection engine:

def main():
    st.write(
    """
    # Finding Ghosts in Your Data

    This is an outlier detection application based on the book Finding Ghosts in Your Data (Apress, 2022).  The purpose of this site is to provide a simple interface for interacting with the outlier detection API we build over the course of the book.

    ## Instructions
    First, select the method you wish to use for outlier detection.  Then, enter the dataset you wish to process.  This dataset should be posted as a JSON array with the appropriate attributes.  The specific attributes you need to enter will depend on the method you chose above.

    If you switch between methods, you will see a sample dataset corresponding to the expected structure of the data.  Follow that pattern for your data.
    """
    )

Place tab A into slot B. Easy.

Something I particularly like about Streamlit is that you don’t have to think about a postback model or how website interaction occurs. Instead, write it as if it were ordinary imperative code and Streamlit will figure things out: it reruns the script from top to bottom whenever a widget’s value changes. For example, here I create a drop-down list with the outlier detection methods we will implement in the book.

method = st.selectbox(label="Choose the method you wish to use.", options = ("univariate", "multivariate", "single_timeseries", "multi_timeseries"))

Then, in order to work with the results, all I need to do is write a simple if statement:

if method == "univariate":
    starting_data_set = """[
        {"key": "1", "value": 1},
        {"key": "2", "value": 2},
        {"key": "3", "value": 3},
        {"key": "4", "value": 4},
        {"key": "5", "value": 5},
        {"key": "6", "value": 6},
        {"key": "8", "value": 95}
    ]
    """
elif method == "multivariate":
    starting_data_set = "TODO - Multivariate"
elif method == "single_timeseries":
    starting_data_set = "TODO - Single time series"
elif method == "multi_timeseries":
    starting_data_set = "TODO - Multi time series"
else:
    starting_data_set = "Select a method."
input_data = st.text_area(label = "Data to process (in JSON format):", value=starting_data_set, height=300)

This lets me pre-populate a text box based on which option people choose, giving an example of what we expect in our dataset.

You chose…wisely.
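As an aside, the same selection logic could also be written as a dictionary lookup rather than an if/elif chain. This is only a sketch of an alternative, not the code from the book: the names SAMPLE_DATA and sample_for are hypothetical, and the univariate sample is generated with json.dumps so that it is guaranteed to be valid JSON.

```python
import json

# Sample payloads keyed by method name. The non-univariate entries simply
# mirror the TODO placeholders from the if/elif version above.
SAMPLE_DATA = {
    "univariate": json.dumps(
        [
            {"key": "1", "value": 1},
            {"key": "2", "value": 2},
            {"key": "3", "value": 3},
            {"key": "4", "value": 4},
            {"key": "5", "value": 5},
            {"key": "6", "value": 6},
            {"key": "8", "value": 95},
        ],
        indent=4,
    ),
    "multivariate": "TODO - Multivariate",
    "single_timeseries": "TODO - Single time series",
    "multi_timeseries": "TODO - Multi time series",
}


def sample_for(method):
    """Return the sample text for a method, or a prompt if it is unknown."""
    return SAMPLE_DATA.get(method, "Select a method.")
```

With that in place, the text area call would become st.text_area(label="Data to process (in JSON format):", value=sample_for(method), height=300). Adding a new method then means adding one dictionary entry instead of another elif branch.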

Buttons are even easier to implement. I want a button which does something on click and it’s just as simple as:

if st.button(label="Detect!"):
    pass  # the detection call and output code go here

Inside there, we can do all of the fancy work without having to think about postback, AJAX calls, or writing bunches of JavaScript.
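To give a rough idea of what that fancy work might look like, here is a minimal sketch, assuming the API accepts the JSON array via POST and replies with JSON. The helper names (parse_input, call_detector, handle_click), the 30-second timeout, and the optional post argument are all hypothetical; the post doesn’t show the actual endpoint or response shape, so the URL is left for the caller to supply.

```python
import json


def parse_input(raw_text):
    """Parse the text area contents into a list of records.

    Returns (records, error_message); error_message is None on success.
    """
    try:
        records = json.loads(raw_text)
    except json.JSONDecodeError as err:
        return None, f"Invalid JSON: {err.msg}"
    if not isinstance(records, list):
        return None, "Expected a JSON array of records."
    return records, None


def call_detector(url, records, post=None):
    """POST the records to the detection endpoint and return the decoded reply.

    `post` exists only so a stand-in can replace requests.post in tests.
    """
    if post is None:
        import requests  # imported lazily; only needed for real calls
        post = requests.post
    response = post(url, json=records, timeout=30)
    response.raise_for_status()  # surface HTTP errors instead of hiding them
    return response.json()


def handle_click(st, url, raw_text, post=None):
    """What might run inside `if st.button(...)`; `st` is the streamlit module."""
    records, error = parse_input(raw_text)
    if error is not None:
        st.error(error)  # show the problem on the page and stop
        return None
    result = call_detector(url, records, post=post)
    st.write(result)
    return result
```

In the app, the button block would then call handle_click(st, url, input_data), where url is whatever address your outlier detection service listens on. Validating the JSON before sending it means a typo in the text area produces a readable message on the page rather than a failed web request.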

Writing Things Out

Streamlit has some nice capabilities for handling output. For example, here’s a fancy scatter plot that I created in Plotly:

colors = {True: 'red', False: 'blue'}
g = px.scatter(df, x=df["value"], y=df["anomaly_score"], color=df["is_anomaly"], color_discrete_map=colors,
            symbol=df["is_anomaly"], symbol_sequence=['square', 'circle'],
            hover_data=["sds", "mads", "iqrs", "grubbs", "gesd", "dixon"], log_x=True)
st.plotly_chart(g, use_container_width=True)

The plotly_chart() function takes a graphic and writes it to screen. The best part is that this chart is fully interactive, just as you’d expect with Plotly.

That looks outlierish, yep.

If you have a DataFrame, writing to a table is really easy. Let’s say we have a DataFrame called tbl. We can output the data as a table with st.write(tbl). We can also write out lists, dictionaries, and other collections using the write() function. When you have things like dictionaries, the write() function will output the data in JSON format but at least it’s color-coded and includes collapsible segments, like this example of my diagnostic details:

It’s all so very fancy.

Conclusion

What I’ve shown here just scratches the surface of Streamlit. There’s quite a bit more to the library, and I highly recommend it for rapid development of applications which will not go to “real” production.
