software & databases



Chemical Variational Autoencoder (chemical_VAE) is free, open-source software for machine learning of molecular properties. chemical_VAE encodes molecular SMILES strings into a continuous code (latent) vector and can decode that vector back into molecular SMILES. The autoencoder may also be trained jointly with a property predictor to help shape the latent space. The resulting latent space can then be searched by optimization to find molecules with the best values of the properties of interest. chemical_VAE is currently being extended in conjunction with MOFid to capture adsorption of molecules in porous materials.
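As an illustration of the SMILES-to-vector step described above, the following minimal sketch one-hot encodes a SMILES string character by character and decodes it again. It is not taken from the chemical_VAE code base: the function names, the padding character, and the character-level tokenization (which treats a two-letter element such as Cl as two tokens) are assumptions made for this example. A matrix of this kind is a typical input to an autoencoder.

```python
# Hypothetical helpers for illustration only; not part of chemical_VAE.
PAD = " "  # assumed padding character, placed at index 0 of the vocabulary


def build_vocab(smiles_list):
    """Collect the distinct characters of a SMILES data set, padding first."""
    chars = sorted({ch for s in smiles_list for ch in s} - {PAD})
    return [PAD] + chars


def one_hot_encode(smiles, vocab, max_len):
    """Return a max_len x len(vocab) matrix with a single 1 per row."""
    if len(smiles) > max_len:
        raise ValueError("SMILES string is longer than max_len")
    index = {ch: i for i, ch in enumerate(vocab)}
    padded = smiles.ljust(max_len, PAD)
    return [[1 if index[ch] == j else 0 for j in range(len(vocab))]
            for ch in padded]


def one_hot_decode(matrix, vocab):
    """Take the arg-max of every row and strip the trailing padding."""
    chars = [vocab[max(range(len(row)), key=row.__getitem__)]
             for row in matrix]
    return "".join(chars).rstrip(PAD)


if __name__ == "__main__":
    vocab = build_vocab(["CCO", "c1ccccc1"])
    encoded = one_hot_encode("CCO", vocab, max_len=10)
    print(one_hot_decode(encoded, vocab))
```

In a variational autoencoder the decoder outputs per-position probabilities rather than exact one-hot rows, which is why the decoding step above takes the arg-max of each row.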

chemical_VAE can be downloaded from:

R. Gómez-Bombarelli, J. Wei, D. Duvenaud, J. Hernández-Lobato, B. Sánchez-Lengeling, D. Sheberla, J. Aguilera-Iparraguirre, T. Hirzel, R. Adams, and A. Aspuru-Guzik, "Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules," ACS Cent. Sci. 4, 268-276 (2018). DOI: 10.1021/acscentsci.7b00572


Computation-Ready Experimental (CoRE) MOF is a set of databases that enables high-throughput computational screening and uses GitHub's versioning system to manage and curate the data. CoRE MOF 2014 provides cleaned atomic coordinates and pore characteristics of 5109 structures while resolving issues such as solvent molecules and partially occupied/disordered atoms in experimental crystal structures. DDEC partial charges and geometry-optimized structures are available for subsets (2900 and 502 MOFs) of the CoRE MOF 2014 database. CoRE MOF 2019 includes 9869 porous MOFs with only free solvent molecules removed and 14,142 porous MOFs with all solvent molecules removed, together with pore characteristics and open-metal-site detection.

The CoRE MOF databases are available for download from


CP2K is a free, open-source quantum chemistry software package designed to perform molecular dynamics and Monte Carlo simulations of clusters and periodic systems. CP2K can be run in both MPI and OpenMP modes, and built-in farming procedures allow for capacity jobs at DOE Leadership Computing Facilities. The NMGC team contributes to CP2K through the development of first-principles Monte Carlo (FPMC) modules for simulations of phase, adsorption, and chemical equilibria and through algorithms for the incorporation of nuclear quantum effects.

CP2K can be downloaded from the project website, and NMGC will make workflows for FPMC simulations of adsorption equilibria available to the community.

Funding for the Siepmann group’s development of modules for CP2K through grants from the Department of Energy (DE-FG02-12ER16362 and DE-FG02-17ER16362 for FPMC simulations of adsorption equilibria; Lawrence Livermore National Laboratory for FPMC simulations in the canonical and isobaric-isothermal ensembles) and the National Science Foundation (FPMC simulations of vapor-liquid equilibria and of reaction equilibria) is gratefully acknowledged.

J. Hutter, M. Iannuzzi, F. Schiffmann, and J. VandeVondele, "CP2K: Atomistic simulations of condensed matter systems," WIREs Comput. Mol. Sci. 4, 15-25 (2014). DOI: 10.1002/wcms.1159

E. O. Fetisov, M. S. Shah, J. R. Long, M. Tsapatsis, and J. I. Siepmann, "First principles Monte Carlo simulations of unary and binary adsorption: CO2, N2, and H2O in Mg-MOF-74," Chem. Commun. 54, 10816-10819 (2018). DOI: 10.1039/C8CC06178E

Force Field Database

The database of polymers of intrinsic microporosity (PIMs) contains a collection of over 240,000 amorphous adsorbent conformations that were created from simulated single-component gas-phase adsorption at 300 K with polymer rearrangement incorporated. The files provide the atomic coordinates, molecular connectivity, and interatomic potential information required to carry out computational micropore analysis and perform molecular simulations. The 24 adsorbate species used to induce framework restructuring span diverse chemical functionalities, molecular geometries, and sizes. Reference data for species uptake, induced swelling, fractional free volume dilation, and surface area expansion are included.

The database can be accessed from:

Anstine, D.M., Tang, D., Sholl, D.S. et al. "Adsorption space for microporous polymers with diverse adsorbate species," npj Comput. Mater. 7, 53 (2021). DOI: 10.1038/s41524-021-00522-8

Master of Filtering

Master of Filtering (MOF) is a game being developed by the NMGC to engage youth and citizen scientists with concepts of porous materials and separations. A video tutorial of the game is available. The long-range goal for this game is the crowdsourcing of nanoporous material design through gamification.


MCCCS-MN is a free, open-source Monte Carlo software package tailored for simulations of phase and adsorption equilibria in the Gibbs ensemble using the TraPPE force field. MCCCS-MN is particularly efficient for equilibria involving multiple condensed phases and articulated molecules. MCCCS-MN uses hybrid MPI/OpenMP for parallel execution and has been adapted to processors with high-bandwidth MCDRAM; workflows with specific I/O handling allow for capacity jobs at DOE Leadership Computing Facilities.

More information about MCCCS-MN can be found at

Funding for the development of MCCCS-MN through grants from the National Science Foundation (simulation of fluid phase equilibria) and the Department of Energy (simulation of adsorption equilibria and high-throughput workflows) is gratefully acknowledged.

MCCCS-Towhee, a more user-friendly but slower version with support for a variety of force fields, is freely available for download.

P. Bai, M. Y. Jeon, L. Ren, C. Knight, M. W. Deem, M. Tsapatsis, and J. I. Siepmann, "Discovery of optimal zeolites for challenging separations and chemical transformations using predictive materials modeling," Nat. Commun. 6, 5912 (2015). DOI: 10.1038/ncomms6912

MCCCS-MN is available under the GNU General Public License. Specific versions of MCCCS-MN used for a given publication are made available as part of the Supporting Information of the following publications:

Y. Sun, R. F. DeJaco, and J. I. Siepmann, "Deep neural network learning of complex binary sorption equilibria from molecular simulation data," Chem. Sci. 10, 4377–4388 (2019). DOI: 10.1039/C8SC05340E

T. R. Josephson, R. Singh, M. S. Minkara, and J. I. Siepmann, "Partial molar properties from molecular simulation using multiple linear regression," Mol. Phys. 117, 3589-3602 (2019). DOI: 10.1080/00268976.2019.1648898


MemPy v1.0 is a Python-based software tool for simulating the performance of gas separations with spiral-wound membranes. It supports a wide variety of calculations, including those with variables depending on one or two dimensions, with the Peng-Robinson or ideal-gas equation of state, and with a linear or nonlinear description of permeance. The models have been validated by comparison with an experimental system for air separation. As such, the software is useful for process intensification of gas separation with spiral-wound membranes.
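To make the choice between the two equations of state concrete, the sketch below evaluates the pressure of a pure gas from the standard Peng-Robinson equation and from the ideal-gas law. It is independent of MemPy and does not use its API; the function names are hypothetical, and the nitrogen critical constants in the usage example are approximate literature values chosen only for illustration.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)


def ideal_gas_pressure(temperature, molar_volume):
    """Ideal-gas pressure in Pa for T in K and molar volume in m^3/mol."""
    return R * temperature / molar_volume


def peng_robinson_pressure(temperature, molar_volume, t_crit, p_crit, omega):
    """Peng-Robinson pressure in Pa for a pure component.

    t_crit (K), p_crit (Pa), and omega (acentric factor) are properties
    of the gas and must be supplied by the caller.
    """
    a = 0.45724 * R**2 * t_crit**2 / p_crit
    b = 0.07780 * R * t_crit / p_crit
    kappa = 0.37464 + 1.54226 * omega - 0.26992 * omega**2
    alpha = (1.0 + kappa * (1.0 - math.sqrt(temperature / t_crit))) ** 2
    repulsion = R * temperature / (molar_volume - b)
    attraction = a * alpha / (molar_volume**2 + 2.0 * b * molar_volume - b**2)
    return repulsion - attraction


if __name__ == "__main__":
    # Approximate constants for nitrogen (illustrative assumption).
    n2 = dict(t_crit=126.2, p_crit=3.3958e6, omega=0.0372)
    for vm in (2.49e-2, 2.49e-4):  # roughly 1 bar and 100 bar at 300 K
        print(vm, ideal_gas_pressure(300.0, vm),
              peng_robinson_pressure(300.0, vm, **n2))
```

At low pressure the two models agree closely, so the ideal-gas option is the cheaper choice; the cubic equation of state matters mainly on the high-pressure feed side.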

MemPy v1.0 is free for download at

R. F. DeJaco, K. Loprete, K. Pennisi, S. Majumdar, J. I. Siepmann, P. Daoutidis, H. Murnen, and M. Tsapatsis, "Modeling and simulation of gas separations with spiral‐wound membranes," AIChE J. online, e16274 (2020). DOI: 10.1002/aic.16274

Geometries for Minnesota Database 2019

Minnesota Database 2019 comprises a diverse set of chemical data that can be used for benchmarking electronic structure calculations and/or optimizing density functionals or wave function methods. The reference values of the data have been published [P. Verma et al., J. Phys. Chem. A 123, 2966-2990 (2019)], and the present compendium provides the molecular geometries, basis set information, and settings that we have used for calculations to compare to the reference data. There are 56 subdatabases in Database 2019, and the data cover a variety of atomic and molecular properties, including atomization energies, reaction energies, bond dissociation energies, isomerization energies, noncovalent complexation energies, proton affinities, electron affinities, ionization potentials, barrier heights, thermochemistry of hydrocarbons, absolute atomic energies, vertical and adiabatic electronic excitation energies, and geometries of molecules; both main-group and transition-metal-containing systems are present.

More information can be found at:

P. Verma, Y. Wang, S. Ghosh, X. He, and D. G. Truhlar, "Revised M11 Exchange-Correlation Functional for Electronic Excitation Energies and Ground-State Properties," J. Phys. Chem. A 123, 2966-2990 (2019). DOI: 10.1021/acs.jpca.8b11499


MOFid and MOFkey form a system for rapid identification and analysis of metal-organic frameworks. The open-source software deconstructs MOFs into their building blocks and underlying topological network. The code comprises three parts: a main C++ code for deconstructing MOF structures into their building blocks, Python code to assemble the MOFid/MOFkey identifiers, and various analysis utilities.

MOFid and MOFkey is available at
A version that runs in your web browser is available at

B. J. Bucior, A. S. Rosen, M. Haranczyk, Z. Yao, M. E. Ziebel, O. K. Farha, J. T. Hupp, J. I. Siepmann, A. Aspuru-Guzik, and R. Q. Snurr, "Identification Schemes for Metal–Organic Frameworks To Enable Rapid Search and Cheminformatics Analysis," Cryst. Growth Des. 19, 6682-6697 (2019). DOI: 10.1021/acs.cgd.9b01050

mrh and pDMET

We developed a CASSCF solver for density matrix embedding theory and the localized active space SCF (LASSCF) method, which iteratively optimizes active space wave functions localized in different fragments of molecular systems. The overall wave function is the product of the localized wave functions. We have also developed a periodic density matrix embedding theory, pDMET.

mrh can be downloaded at:

pDMET can be downloaded at:

H. Q. Pham, M. R. Hermes, and L. Gagliardi, "Periodic Electronic Structure Calculations with the Density Matrix Embedding Theory," J. Chem. Theory Comput. 16, 130-140 (2020). DOI: 10.1021/acs.jctc.9b00939


The Nanoporous Materials Adsorption Energy (NMAE) Database is a freely available database currently under development that provides a repository for adsorption energies (internal energy of adsorption, enthalpy of adsorption, Gibbs free energy of adsorption) predicted and measured by the NMGC team.

Visit the NMAE page for more information.

Nanoporous Materials Explorer

The Nanoporous Materials Explorer App is a database of computed properties for nanoporous materials. The application aims to present the accumulated data in a new, interactive way. The Nanoporous Materials Explorer App data are predicted, measured, and maintained by the NMGC in partnership with the Materials Project. Currently, data such as structures and point charges are recorded in a searchable format for more than 500,000 nanoporous materials.

The App (requires registration) and a detailed manual are available at

Predicted Nanoporous Material Structures

Predicted Nanoporous Material Structures are databases of predicted crystal structures (COFs, PPNs, etc.) that resulted from projects carried out with NMGC funding (among others). The structures are shared as archives of crystallographic information files (.cif).

Download and find documentation at


pyIAST is a user-friendly, open-source Python package that can fit data to analytical isotherm models or use interpolation to characterize pure-component adsorption isotherms.
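Once pure-component isotherms have been characterized, ideal adsorbed solution theory (IAST) predicts mixture adsorption from them. The sketch below is a standalone illustration of that calculation for components that each follow a Langmuir isotherm; it does not call pyIAST, and the function names and the bisection solver are assumptions made for this example. It uses the fact that the reduced spreading pressure of a Langmuir isotherm with saturation loading M and affinity K is M ln(1 + K P).

```python
import math


def langmuir_loading(pressure, m, k):
    """Pure-component Langmuir loading with saturation m and affinity k."""
    return m * k * pressure / (1.0 + k * pressure)


def iast_langmuir(partial_pressures, params, n_iter=200):
    """Component loadings of a gas mixture from IAST.

    partial_pressures: bulk partial pressures, all > 0.
    params: one (m, k) Langmuir parameter pair per component.
    """
    if any(p <= 0.0 for p in partial_pressures):
        raise ValueError("all partial pressures must be positive")

    def pure_pressure(pi, m, k):
        # Pure-component pressure at reduced spreading pressure pi.
        return (math.exp(pi / m) - 1.0) / k

    def residual(pi):
        # Adsorbed-phase mole fractions x_i = P_i / P0_i must sum to one.
        return sum(p / pure_pressure(pi, m, k)
                   for p, (m, k) in zip(partial_pressures, params)) - 1.0

    # Bracket the root: residual >= 0 at lo and <= 0 at hi.
    p_total = sum(partial_pressures)
    lo = min(m * math.log(1.0 + k * p)
             for p, (m, k) in zip(partial_pressures, params))
    hi = max(m * math.log(1.0 + k * p_total) for m, k in params)
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    pi = 0.5 * (lo + hi)

    p0 = [pure_pressure(pi, m, k) for m, k in params]
    x = [p / p0_i for p, p0_i in zip(partial_pressures, p0)]
    inv_total = sum(x_i / langmuir_loading(p0_i, m, k)
                    for x_i, p0_i, (m, k) in zip(x, p0, params))
    return [x_i / inv_total for x_i in x]


if __name__ == "__main__":
    print(iast_langmuir([0.3, 0.7], [(3.0, 2.0), (3.0, 0.5)]))
```

When all components share the same saturation loading, IAST reduces to the extended Langmuir model, which gives a convenient check of the solver.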

pyIAST is hosted on GitHub and is free to download, with additional documentation available online. Users may also contribute to the source code via GitHub. Communication with the pyIAST authors is by email and GitHub's messaging system.

C. Simon, B. Smit, and M. Haranczyk, "pyIAST: Ideal Adsorbed Solution Theory (IAST) Python Package," Comput. Phys. Commun. 200, 364-380 (2016). DOI: 10.1016/j.cpc.2015.11.016


Python Isotherm Prediction (PyIsoP) is an open‐source software package that uses a fast and accurate, semi‐analytical algorithm to calculate the adsorption of single‐site molecules in NPMs using energy grids. The method is about 100 times faster than atomistic grand canonical MC simulations and is useful for obtaining quick estimates of adsorption for high‐throughput screening of large databases.

PyIsoP can be downloaded from:

A. Gopalan, B. J. Bucior, N. S. Bobbitt, R. Q. Snurr, "Prediction of hydrogen adsorption in nanoporous materials from the energy distribution of adsorption sites," Mol. Phys. 117, 3683–3694 (2019). DOI: 10.1080/00268976.2019.1658910


PySCF is a free, open-source quantum chemistry and solid-state physics software package designed to perform electronic structure calculations in molecular and periodic systems. The NMGC team contributes to PySCF through the development of approaches for quantum embedding calculations. This code is hosted on GitHub and is free to download. Future goals are the development of robust quantum methods for highly accurate calculations in large systems.

Q. Sun, T. C. Berkelbach, N. S. Blunt, G. H. Booth, S. Guo, Z. Li, J. Liu, J. McClain, E. R. Sayfutyarova, S. Sharma, S. Wouters, and G. K.-L. Chan, "PySCF: the Python‐based simulations of chemistry framework," WIREs Comput. Mol. Sci. 8, e1340 (2018). DOI: 10.1002/wcms.1340

D. V. Chulhai and J. D. Goodpaster, "Projection-Based Correlated Wave Function in Density Functional Theory Embedding for Periodic Systems," J. Chem. Theory Comput. 14, 1928-1942 (2018). DOI: 10.1021/acs.jctc.7b01154


pySIMM is an open-source, object-oriented Python package for molecular simulations. It handles data organization for particles, force field parameters, and simulation settings so you can focus on developing your simulation workflow. It can create long linear polymer chain structures modeled with common atomistic force fields. pySIMM features LAMMPS integration, Cassandra integration, support for common atomistic force fields, and integration with dedicated software for geometry analysis. Documentation and downloads are available at the pySIMM website.

M. E. Fortunato and C. M. Colina, "pysimm: A python package for simulation of molecular systems," SoftwareX 6, 7-12 (2017). DOI: 10.1016/j.softx.2016.12.002


QMMM is a computer program for performing single-point calculations (energies, gradients, and Hessians), geometry optimizations, and molecular dynamics using combined quantum mechanics (QM) and molecular mechanics (MM) methods. The boundary between the QM and MM regions can be treated by a number of schemes, including the redistributed charge (RC) scheme, the redistributed charge and dipole (RCD) scheme, the polarized-boundary RC (PBRC) scheme, the polarized-boundary RCD (PBRCD) scheme, the flexible-boundary RC (FBRC) scheme, and the flexible-boundary RCD (FBRCD) scheme. QMMM calls a QM package and an MM package to perform the required single-level calculations. QMMM was tested with GAMESS, Gaussian (both Gaussian 09 and Gaussian 16), and ORCA for the QM package and with TINKER for the MM package; it contains 156 sample runs that can be used to learn and test the program.

After completing a free license form, QMMM can be freely downloaded from:

QMMM 2017 by H. Lin, Y. Zhang, S. Pezeshki, B. Wang, X.-P. Wu, L. Gagliardi, and D. G. Truhlar, University of Minnesota, Minneapolis, 2017.

QMOF Database

The Quantum Metal–Organic Framework (QMOF) database contains quantum-chemical properties for over 14,000 experimental MOF crystal structures, computed using periodic density functional theory calculations.

More information can be found at the QMOF GitHub repository.

A. Rosen, S. Iyer, D. Ray, Z. Yao, A. Aspuru-Guzik, L. Gagliardi, J.M. Notestein, and R.Q. Snurr, "Machine Learning the Quantum‑Chemical Properties of Metal–Organic Frameworks for Accelerated Materials Discovery," Matter 4 (5) 1578–1597 (2021). DOI: 10.1016/j.matt.2021.02.015


QSoME (Quantum Solid state and Molecular Embedding) is a free, open-source quantum chemistry software designed to perform quantum embedding calculations in molecular and periodic systems. The code can be used to calculate adsorption energies of small molecules in MOFs and zeolites. Example calculations are provided.

QSoME can be downloaded at:

D.S. Graham, X. Wen, D.V. Chulhai, and J.D. Goodpaster, "Robust, Accurate, and Efficient: Quantum Embedding Using the Huzinaga Level-Shift Projection Operator for Complex Systems," J. Chem. Theory Comput. 16 (4) 2284–2295 (2020). DOI: 10.1021/acs.jctc.9b01185


RASPA is a software package for simulating adsorption and diffusion of molecules in flexible nanoporous materials. The code implements state-of-the-art algorithms for molecular dynamics and Monte Carlo in various ensembles. Applications of RASPA include computing coexistence properties, adsorption isotherms for single and multiple components, self- and collective diffusivities, and visualization. RASPA is particularly efficient for gas adsorption in a wide variety of porous materials. The NMGC team contributes to the development of RASPA.

RASPA is available for download from a git server. Information on RASPA is provided at

D. Dubbeldam, S. Calero, D. E. Ellis, and R. Q. Snurr, "RASPA: Molecular simulation software for adsorption and diffusion in flexible nanoporous materials," Molec. Sim. 42, 81-101 (2016). DOI: 10.1080/08927022.2015.1010082


SorbMetaML is an open‐source meta-learning model for the prediction of unary adsorption for nanoporous materials based on example adsorption data for a material. SorbMetaML has been used to identify the optimal hydrogen storage temperature with the highest working capacity for a given pressure difference for diverse nanoporous materials. Datasets for the hydrogen adsorption of all-silica zeolites, hyper-cross-linked polymers, and metal-organic frameworks are provided.
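The idea of an optimal storage temperature can be illustrated without the meta-learning model. The sketch below is not SorbMetaML: it assumes a single-site Langmuir isotherm with a van 't Hoff temperature dependence, and all function names and parameter values are hypothetical. It scans a temperature grid for the largest working capacity, defined as the uptake at the storage pressure minus the uptake at the delivery pressure.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)


def langmuir_uptake(pressure, temperature, q_max, k0, d_h):
    """Langmuir uptake with K(T) = k0 * exp(-d_h / (R T)); d_h < 0 in J/mol."""
    k = k0 * math.exp(-d_h / (R * temperature))
    return q_max * k * pressure / (1.0 + k * pressure)


def working_capacity(temperature, p_high, p_low, q_max, k0, d_h):
    """Uptake at the storage pressure minus uptake at the delivery pressure."""
    return (langmuir_uptake(p_high, temperature, q_max, k0, d_h)
            - langmuir_uptake(p_low, temperature, q_max, k0, d_h))


def best_temperature(temperatures, p_high, p_low, q_max, k0, d_h):
    """Grid search for the temperature with the largest working capacity."""
    return max(temperatures,
               key=lambda t: working_capacity(t, p_high, p_low,
                                              q_max, k0, d_h))


if __name__ == "__main__":
    # Hypothetical parameters: pressures in bar, k0 in 1/bar, d_h in J/mol.
    grid = [50.0 + 0.1 * i for i in range(3501)]
    print(best_temperature(grid, p_high=100.0, p_low=5.0,
                           q_max=1.0, k0=1.0e-3, d_h=-6000.0))
```

For this simple model the optimum can also be found analytically: the working capacity peaks where K equals 1/sqrt(p_high * p_low), which is a useful check on the grid search. Real materials generally need a richer isotherm description, which is where a learned model helps.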

SorbMetaML can be downloaded from:


SorbNet is an open‐source deep neural network for the prediction of adsorption data for binary mixtures over large temperature and pressure ranges that can be used to optimize adsorption/desorption conditions. Example datasets and Python notebooks are provided.

SorbNet can be downloaded from:

Y. Sun, R. F. DeJaco, and J. I. Siepmann, "Deep Neural Network Learning of Complex Binary Sorption Equilibria from Molecular Simulation Data," Chem. Sci. 10, 4377–4388 (2019). DOI: 10.1039/C8SC05340E


SupramolecularVAE is an open-source multi-component variational autoencoder for the property-guided inverse design of reticular frameworks including metal-organic frameworks and covalent-organic frameworks. Example datasets and Python notebooks are provided. The NMGC team is the sole developer.

For more information or to download SupramolecularVAE go to:


Zeo++ is an open-source software for performing high-throughput geometry-based analysis of porous materials and their voids. Future plans for Zeo++ include the addition of functionality for hard and soft nanoporous materials. Zeo++ serves approximately 800 registered users, who can communicate with the developer via email.

Registration is required by LBNL for downloading the code from

T. F. Willems, C. H. Rycroft, M. Kazi, J. C. Meza, and M. Haranczyk, "Algorithms and tools for high-throughput geometry-based analysis of crystalline porous materials," Microporous Mesoporous Mater. 149, 134-141 (2012). DOI: 10.1016/j.micromeso.2011.08.020