Sequencing Analysis and Data Library for Immunoinformatics Exploration

SADIE

About


Documentation: https://sadie.jordanrwillis.com

Source Code: https://github.com/jwillis0720/sadie

Colab: https://colab.research.google.com/github/jwillis0720/sadie


SADIE is the Sequencing Analysis and Data Library for Immunoinformatics Exploration. The key features include:

  • Provide pre-built command line apps for popular immunoinformatics applications.

  • Provide a low-level API framework for immunoinformatics developers to build higher level tools.

  • Provide a testable and reusable library that WORKS!

  • Provide a customizable and verified germline reference library.

  • Maintain data formats consistent with the standards governed by the AIRR community.

  • Be portable and ready to use out of the box.

SADIE is billed as a "complete antibody library", not because it aims to do everything, but because it aims to meet the needs of all immunoinformatics users. SADIE contains low-, mid-, and high-level functionality for immunoinformatics tools and workflows. You can use SADIE as a framework to develop your own tools, use the many prebuilt contributed tools, or run it in a notebook for data exploration. In addition, SADIE aims to port all code to Python because it relies heavily on the Pandas library, the workhorse of the data science and machine learning age.

Installation


Installation is handled using the Python package installer pip:

$ pip install sadie-antibody

Installation with M1 or M2 Macs


If you use the Apple silicon (M1 or M2) version of Conda, create your SADIE environment as follows:

$ conda create -n sadie python=3.10.6
$ conda activate sadie
$ pip install sadie-antibody
$ conda install -c conda-forge biopython

Note

You must install biopython from conda since pip install sadie-antibody will not install the proper version of biopython.

For additional help, please file an issue on the SADIE GitHub.

Quick Usage

Consult the documentation for complete usage, or check out our Colab notebook.

Annotate antibody sequences using only functional human IMGT antibodies and write the output to the named file (here output.csv)

$ sadie airr -n human my_sequences.fasta output.csv
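
The command above takes a FASTA file as input. As a minimal sketch, assuming your sequences already sit in a Python dictionary, the hypothetical helper write_fasta below (not part of SADIE) produces such a file using only the standard library:

```python
from pathlib import Path


def write_fasta(records, path, width=80):
    """Write {sequence_id: nucleotide_sequence} pairs to a FASTA file.

    Hypothetical helper for illustration; it is not part of the SADIE API.
    """
    lines = []
    for seq_id, seq in records.items():
        # Each record starts with a header line: ">" followed by the ID
        lines.append(f">{seq_id}")
        # Wrap long sequences at a fixed width, as is conventional for FASTA
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    out = Path(path)
    out.write_text("\n".join(lines) + "\n")
    return out
```

Calling write_fasta with a dictionary of IDs and sequences and the path my_sequences.fasta gives a file that can be passed to sadie airr as shown above.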

Use the SADIE library to annotate sequences

# import the SADIE Airr module
from sadie.airr import Airr

# define a single sequence
pg9_seq = "CAGCGATTAGTGGAGTCTGGGGGAGGCGTGGTCCAGCCTGGGTCGTCCCTGAGACTCTCCTGTGCAGCGTCCGGATTCGACTTCAGTAGACAAGGCATGCACTGGGTCCGCCAGGCTCCAGGCCAGGGGCTGGAGTGGGTGGCATTTATTAAATATGATGGAAGTGAGAAATATCATGCTGACTCCGTATGGGGCCGACTCAGCATCTCCAGAGACAATTCCAAGGATACGCTTTATCTCCAAATGAATAGCCTGAGAGTCGAGGACACGGCTACATATTTTTGTGTGAGAGAGGCTGGTGGGCCCGACTACCGTAATGGGTACAACTATTACGATTTCTATGATGGTTATTATAACTACCACTATATGGACGTCTGGGGCAAAGGGACCACGGTCACCGTCTCGAGC"

# setup API object
airr_api = Airr("human")

# run sequence and return airr table with sequence_id and sequence
airr_table = airr_api.run_single("PG9", pg9_seq)

# write the AIRR table to a tsv file
airr_table.to_airr("PG9 AIRR.tsv")

# compress your AIRR table into a gzip or bzip2 file
airr_table.to_airr("PG9 AIRR.tsv.gz")
airr_table.to_airr("PG9 AIRR.tsv.bz2")
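
The files written above can be read back for downstream work. The following is a minimal sketch, assuming that to_airr writes a tab-separated table with a header row (as the AIRR Rearrangement format specifies) and that the .gz and .bz2 suffixes indicate gzip and bzip2 compression. The helper read_airr_tsv is hypothetical and not part of SADIE:

```python
import bz2
import csv
import gzip
from pathlib import Path


def read_airr_tsv(path):
    """Read a tab-separated AIRR-style file into a list of row dictionaries.

    Hypothetical helper for illustration; it is not part of the SADIE API.
    Assumes a header row and picks the decompressor from the file suffix.
    """
    path = Path(path)
    if path.suffix == ".gz":
        opener = gzip.open
    elif path.suffix == ".bz2":
        opener = bz2.open
    else:
        opener = open
    # Text mode with newline="" is what the csv module expects
    with opener(path, "rt", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

For example, read_airr_tsv("PG9 AIRR.tsv.gz") would return one dictionary per annotated sequence, keyed by column name. If a DataFrame is preferred, pandas.read_csv with sep="\t" is an alternative.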

License

  • Copyright © Jordan R. Willis, Troy Sincomb, and Caleb K Kibet