Venue: Walter E. Lay Auto Lab – 2052
Alex Wadell is a PhD candidate at the University of Michigan Department of Mechanical Engineering.
Molecular foundation models are emerging as a powerful tool for molecular design, materials science, and cheminformatics. By leveraging the transformer architecture, these models attempt to learn the language of chemistry and discover robust molecular embeddings. However, current models are constrained by tokenizers that fail to capture the full breadth of chemical space, or even the full periodic table of elements. In his talk, Alex will introduce smirk, a new tokenizer for molecular foundation models that can represent the entirety of the OpenSMILES specification. He will also discuss performance metrics for tokenizers and the results of his systematic evaluation of thirteen chemistry-specific tokenizers, using n-gram language models as a low-cost proxy for transformer models.
If you cannot attend in person, please feel free to join virtually.
Join Zoom Meeting
https://umich.zoom.us/j/978235
Meeting ID: 978 2352 7756
Passcode: 2024