https://xcancel.com/charliermarsh/status/1884651482009477368

We’re building a new static type checker for Python, from scratch, in Rust.

From a technical perspective, it’s probably our most ambitious project yet. We’re about 800 PRs deep!

Like Ruff and uv, there will be a significant focus on performance.

The entire system is designed to be highly incremental so that it can eventually power a language server (e.g., only re-analyze affected files on code change).
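
To make "only re-analyze affected files" concrete, here is a toy sketch in Python. It is not how the checker itself works (that code is in Rust and far more fine-grained); the `PROJECT` graph, the file names, and the `affected_files` helper are all made up for illustration. The idea shown is simply that a change to one file invalidates that file plus everything that transitively imports it, and nothing else.

```python
from collections import defaultdict, deque


def affected_files(imports: dict[str, set[str]], changed: str) -> set[str]:
    """Return `changed` plus every file that transitively imports it."""
    # Invert the import graph: for each file, which files import it?
    importers: dict[str, set[str]] = defaultdict(set)
    for file, deps in imports.items():
        for dep in deps:
            importers[dep].add(file)

    # Breadth-first walk over the reverse edges, starting at the changed file.
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for importer in importers[current]:
            if importer not in seen:
                seen.add(importer)
                queue.append(importer)
    return seen


# Hypothetical project: each file maps to the set of files it imports.
PROJECT = {
    "app.py": {"models.py", "utils.py"},
    "models.py": {"utils.py"},
    "utils.py": set(),
    "cli.py": {"app.py"},
    "docs.py": set(),
}

# Editing models.py touches app.py and cli.py, but not utils.py or docs.py.
to_recheck = affected_files(PROJECT, "models.py")
```

In this sketch, editing `docs.py` would trigger a re-check of that one file only, which is the property a language server needs to stay responsive on large projects.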

Performance is just one of many goals, though.

For example: we’re investing heavily in strong theoretical foundations and a consistent model of Python’s typing semantics.

(We’re lucky to have @carljm and @AlexWaygood on the team for many reasons; this is one of them.)

Another goal: minimizing false positives, especially on untyped code, to make it easier for projects to adopt a type checker and expand coverage gradually over time, without being swamped in bogus type errors from the start.
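
As a rough illustration of gradual adoption (plain Python, with made-up function names, and no claim about what this particular checker reports): under Python's typing rules, unannotated parameters are treated as `Any`, so an untyped function can sit next to an annotated one and feed into it without the annotated code having to change.

```python
def scale(values, factor):
    # No annotations: the parameters are implicitly Any, so a gradual
    # checker has nothing definite to complain about here.
    return [v * factor for v in values]


def mean(values: list[float]) -> float:
    # Annotated: callers and the body are checked against these types.
    return sum(values) / len(values)


# Untyped and typed code interoperate; annotations can be added to
# `scale` later without touching `mean`.
result = mean(scale([1.0, 2.0, 3.0], 2))
```

A project can start by annotating only the functions where the payoff is highest and leave the rest dynamic until there is time to cover them.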

We haven’t publicized it to date, but all of this work has been happening in the open, in the Ruff repository.

All driven by a uniquely great team: @carljm, @AlexWaygood, @sharkdp86, @MichaReiser, @DhruvManilawala, @ibraheemdev, @dcreager.

I’m learning so much from them.

Warning: this project is not ready for real-world user testing, and certainly not for production use (yet). The core architecture is there, but we’re still lacking support for some critical features.

Right now, I’d only recommend trying it out if you’re looking to contribute.

For now, we’re working towards an initial alpha release. When it’s ready, I’ll make sure you know :)

  • alsimoneau@lemmy.ca
    12 hours ago

    Thanks for the answer. It is a genuine question.

    But don’t you lose polymorphism? It seems like a big trade-off. For context, I’m a scientist doing data analysis and modeling, so my viewpoint is potentially significantly different from most of “the industry”.

    Your points 1-3 are handled by running the code and reading the error messages, if any. For 4-5, “ugly” code will be unreadable whether it’s typed or not. For 6, refactoring now requires changing the types everywhere, which I imagine could be error-prone and increase code inertia. And for 7, it would definitely slow down development until you get familiar with the libraries and have tooling to automate stuff.

    I can understand the appeal for enterprise code but that kind of project seems doomed to go against the Zen of Python anyways, so it’s probably not the best language for that.

    • FizzyOrange@programming.dev
      6 hours ago

      But don’t you lose polymorphism?

      No. You’ll have to be more specific about what kind of polymorphism you mean (it’s an overloaded term), but you can have type unions, like int | str.

      Your points 1-3 are handled by running the code and reading the error messages, if any

      Not unless you have ridiculously exhaustive tests, which you definitely don’t. And running tests is still slower than your editor telling you of your mistake immediately.

      I probably didn’t explain 4-6 well enough if you haven’t actually ever used static types.

      They make it easier to navigate because your IDE now understands your code and you can do things like “find all references”, and “go to definition”. With static types you can e.g. ctrl-click on mystruct.myfield and it will go straight to the definition of myfield.

      They make the code easier to understand because knowing the types of variables tells you a lot of information about what they are and how to use them. You’ll often see in untyped code people add comments saying what type things are anyway.

      Refactoring is easier because your IDE understands your code, so you can do things like renaming variables and moving code and it will update all the things it needs to correctly. Refactoring is also one of those areas where it tends to catch a lot of mistakes. E.g. if you change the type of something or the parameters of a function, it’s very easy to miss one place where it was used.
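
      A minimal sketch of that last case, with a hypothetical `retry_after` function whose parameter was changed from a number of seconds to a `timedelta`. The forgotten call site still passes an `int`. A static checker would flag that line as soon as the signature changes; without one, the mistake only surfaces when that exact line runs.

```python
from datetime import timedelta


# Before the refactor (hypothetical): def retry_after(delay_seconds: float) -> str
def retry_after(delay: timedelta) -> str:
    return f"retry in {delay.total_seconds():.0f}s"


# A call site that was updated along with the signature.
updated_call = retry_after(timedelta(seconds=30))


def stale_call_fails() -> bool:
    """Simulate a call site that was missed during the refactor."""
    try:
        # A type checker reports this argument: int is not a timedelta.
        retry_after(30)
    except AttributeError:
        # At runtime the error only appears once this line is executed.
        return True
    return False
```

      If the stale call sits in a rarely used branch, tests may never reach it, which is why the editor-time error is the more reliable safety net.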

      I don’t think “you need to learn it” really counts as slowing down development. It’s not that hard anyway.

      I can understand the appeal for enterprise code but that kind of project seems doomed to go against the Zen of Python anyways, so it’s probably not the best language for that.

      It’s probably best not to use Python for anything, but here we are.

      I will grant that data science is probably one of the very few areas where you may not want to bother, since I would imagine most of your code is run exactly once. So that might explain why you don’t see it as worthwhile. For code that is long-lived it is very very obviously worth it.