
v1.1.0 Release


METL 1.1 is released!

This release is an important milestone in the development of METL. It introduces metl-deploy, a set of scripts that makes it easy to deploy METL on AWS. My Price Health runs METL on Google Cloud and Health Rosetta runs it on AWS, so metl-deploy takes METL multi-cloud. Since METL runs on Kubernetes, we are confident it can run on Azure and on bare metal as well.

What's new?

The principal user-facing feature of METL 1.1 is a set of claims loader scripts for Assured Benefits Administrators (ABA), which let us view this TPA's post-adjudicated claims data. The ABA loader was also our first load of post-adjudicated claims data into the claims schema, so we made some schema modifications to support it.

As with all other METL database loads, loading data into the database consists of two components.

  1. extractor configuration. METL extracts files before loading them. In METL, extracting can mean a number of different things:
    1. Unzip a file. This is a frequent use case for CMS and other standard data files published online.
    2. Convert data from multiple formats. For now, extractor supports fixed-width and CSV files; we anticipate adding more file formats over time.
    3. Clean invalid data from data files. Data files from CMS and other sources fairly often contain spurious invalid characters. Our favorite example is the NPI registry data files: they are around 9 GB unzipped, and loading that much data takes a long time. It's always fun to spend 50 minutes extracting and attempting to load a file only to find an error on line 1,230,874, and then, after fixing that by hand, another on line 1,597,145. Without automated cleaning, a 50-minute load can stretch into many hours, if not multiple days, of manual work.
    4. Pull data from one or more files in a zip, or from a single file. Sometimes it's important to combine data from multiple files into one (e.g. NPI registry data, which spreads name information across multiple files), and sometimes it's important to split a single file into multiple files (e.g. claims data, which may repeat the same data on multiple lines that we don't want to load multiple times). Load scripts are a LOT easier to write when the files look similar to the tables they will be loaded into.
    5. Write data to an easily bulk-loadable set of CSV files. By unifying the format of the files we load to the database, extractor makes it far easier to build load scripts for csvloader. (A sketch of a typical extract step follows this list.)
  2. csvloader configuration and load scripts. METL starts with uniform, clean CSVs, loads them into staging tables of the same names, and then runs load scripts to transform the data into its final shape.
    1. The first step is for METL to read the load config file, which identifies the order in which files are loaded and which scripts are run. All load scripts are plain .sql files, and everything is checked into source control, so it's easy to see exactly what's going on at any time.
    2. Load steps are then performed in the order specified in the config file. For most steps, a CSV is loaded into a staging table, and a load script then moves the data into the production tables.
    3. Because every load script is just SQL, any transformation you can do in SQL, you can do with csvloader. (A sketch of one load step follows below.)
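
To make the extractor side concrete, here is a minimal sketch of an extract step that converts a fixed-width file into a clean, bulk-loadable CSV, dropping invalid bytes along the way. The column names, widths, and file names are invented for illustration; the real extractor is driven by configuration rather than hard-coded like this.

```python
import csv

# (column name, width) pairs for a made-up fixed-width layout.
LAYOUT = [("npi", 10), ("last_name", 35), ("first_name", 20)]

def extract(raw_path: str, csv_path: str) -> None:
    """Convert a fixed-width file into a clean, bulk-loadable CSV."""
    with open(raw_path, "rb") as raw, \
         open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow([name for name, _ in LAYOUT])
        for line in raw:
            # Drop spurious invalid bytes instead of failing a million
            # lines into the load (the cleaning step described above).
            text = line.decode("utf-8", errors="ignore")
            row, pos = [], 0
            for _, width in LAYOUT:
                row.append(text[pos:pos + width].strip())
                pos += width
            writer.writerow(row)

if __name__ == "__main__":
    extract("npidata_raw.txt", "npidata_clean.csv")
```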
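
And here is a similarly hypothetical sketch of one csvloader-style load step: the clean CSV is bulk-copied into a staging table of the same name, then a plain-SQL load script moves the data into a production table. The table names, columns, and PostgreSQL connection details are assumptions for illustration, not METL's actual configuration.

```python
import psycopg2  # assumes a PostgreSQL target

# In METL the transformation lives in a checked-in .sql file; it is
# inlined here only to keep the sketch self-contained.
LOAD_SCRIPT = """
INSERT INTO providers (npi, last_name, first_name)
SELECT npi, last_name, first_name
FROM npidata_clean
ON CONFLICT (npi) DO NOTHING;
TRUNCATE npidata_clean;
"""

def load_step(conn, csv_path: str, staging_table: str) -> None:
    """Bulk-copy one CSV into its staging table, then run its load script."""
    with conn.cursor() as cur, open(csv_path, encoding="utf-8") as f:
        cur.copy_expert(
            f"COPY {staging_table} FROM STDIN WITH (FORMAT csv, HEADER true)", f
        )
        cur.execute(LOAD_SCRIPT)
    conn.commit()

if __name__ == "__main__":
    with psycopg2.connect("dbname=metl") as conn:
        load_step(conn, "npidata_clean.csv", "npidata_clean")
```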

Release details

Claims: Pull Request, Commit hash: 6ebcc3f

METL-deploy: Pull Request, Commit hash: 7daa830

Monorepo (not public, but listed here for documentation; we expect to open source extractor and csvloader in the future): Pull Request