Can Codon ‘Turbocharge Python’s Notoriously Slow Compiler’?

IEEE Spectrum reports on Codon, a Python compiler developed to, as the publication puts it, “turbocharge Python’s notoriously slow compiler.”

“We do type checking during the compilation process, which lets us avoid all of that expensive type manipulation at runtime,” says Ariya Shajii, an MIT CSAIL graduate student and lead author on a recent paper about Codon.

With no unnecessary data or type checking at runtime, Codon incurs zero overhead, according to Shajii. And when it comes to performance, “Codon is typically on par with C++. Versus Python, what we usually see is 10 to 100x improvement,” he says. But Codon’s approach comes with trade-offs. “We do this static type checking, and we disallow some of the dynamic features of Python, like changing types at runtime dynamically,” says Shajii. “There are also some Python libraries we haven’t implemented yet….”
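To make the trade-off concrete, here is a minimal sketch in plain Python of the two styles Shajii contrasts. The function names (total_gc, describe) and the sample data are hypothetical and chosen only for illustration; they do not come from the Codon paper. The first function keeps every variable at a single type, which is the style a compile-time type checker can resolve ahead of time. The second rebinds a variable from an int to a str, which standard Python accepts but which is the kind of runtime type change a statically typed compiler such as Codon would be expected to reject.

```python
from typing import List


def total_gc(sequences: List[str]) -> int:
    """Count G and C bases; every variable keeps one type throughout."""
    count = 0  # always an int
    for seq in sequences:
        count += sum(1 for base in seq if base in "GC")
    return count


def describe(value: int):
    """Rebind `result` from int to str.

    This runs under standard Python, but it is the kind of runtime type
    change a static type checker would be expected to reject (assumption:
    the exact diagnostic depends on the compiler).
    """
    result = value * 2      # int here
    result = f"{result}!"   # now a str
    return result


if __name__ == "__main__":
    print(total_gc(["GATTACA", "CCGG"]))
    print(describe(21))
```

Under this assumption, the usual fix is to introduce a second variable for the string form so that each name has exactly one type.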

Codon was initially designed for use in genomics and bioinformatics. “Data sets are getting really big in these fields, and high-level languages like Python and R are too slow to handle terabytes per set of sequencing data,” says Shajii. “That was the gap we wanted to fill — to give domain experts who are not necessarily computer scientists or programmers by training a way to tackle large data without having to write C or C++ code.” Aside from genomics, Codon could also be applied to similar applications that process massive data sets, as well as areas such as GPU programming and parallel programming, which the Python-based compiler supports. In fact, Codon is now being used commercially in the bioinformatics, deep learning, and quantitative finance sectors through the startup Exaloop, which Shajii founded to shift Codon from an academic project to an industry application.

To enable Codon to work with these different domains, the team developed a plug-in system. “It’s like an extensible compiler,” Shajii says. “You can write a plug-in for genomics or another domain, and those plug-ins can have new libraries and new compiler optimizations….” As for what’s next, Shajii and his team are working on native implementations of widely used Python libraries, as well as library-specific optimizations to get much better performance out of them. They also plan to build a widely requested feature: a WebAssembly back end for Codon that would let code run in a web browser.
