This documentation is for Jedi developers who want to improve Jedi itself, but have no idea how Jedi works. If you want to use Jedi for your IDE, look at the plugin api.
This page tries to address the fundamental demand for documentation of the Jedi internals. Understanding a dynamic language is a complex task, especially because type inference in Python can be a very recursive task. Therefore Jedi couldn’t get rid of complexity. I know that simple is better than complex, but unfortunately it sometimes requires complex solutions to understand complex systems.
Since most of the Jedi internals have been written by me (David Halter), this introduction will be written mostly by me, because no one else understands how Jedi works to the same level. This is also the reason for this part of the documentation: to enable multiple people to edit the Jedi core.
In five chapters I’m trying to describe the internals of Jedi:
Testing is not documented here; you’ll find that right here.
The Jedi Core¶
The core of Jedi consists of three parts:
The Parser tries to convert the available Python code into an easy-to-read format, something like an abstract syntax tree. The classes that represent this tree live in the parser tree module (parser/tree.py).
The Python module tokenize is a very important part of the Parser, because it splits the code into different words (tokens). Sometimes it looks a bit messy. Sorry for that! You might ask now: “Why didn’t you use the ast module for this?” Well, ast does a very good job understanding proper Python code, but fails to work as soon as there’s a single line of broken code.
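The point about broken code can be illustrated with a small sketch that uses only the standard library. It is not Jedi's parser, and the sample source string and helper names are made up for this illustration. `ast.parse` rejects the whole input because of one bad line, while `tokenize` still splits the same input into tokens.

```python
import ast
import io
import tokenize

# Hypothetical input: the second line is not valid Python.
BROKEN_SOURCE = "import os\nx = = 1\n"


def parses_with_ast(code):
    """Return True if the standard ast module accepts the code."""
    try:
        ast.parse(code)
    except SyntaxError:
        return False
    return True


def significant_tokens(code):
    """Return the strings of all name, operator and number tokens."""
    wanted = (tokenize.NAME, tokenize.OP, tokenize.NUMBER)
    readline = io.StringIO(code).readline
    return [tok.string for tok in tokenize.generate_tokens(readline)
            if tok.type in wanted]
```

The tokens of the valid first line are still available even though the second line cannot be parsed, which is the property an error-tolerant parser builds on.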
There’s one important optimization to be aware of: statements are not parsed completely. A Statement is just a representation of the tokens within the statement. This lowers memory usage and CPU time and reduces the complexity of the
Parser (there’s another parser sitting inside
Statement, which produces
Parser Tree (parser/tree.py)¶
If you know what an abstract syntax tree (AST) is, you’ll see that this module is pretty much that. The classes represent syntax elements like functions and imports.
This is the “business logic” part of the parser. There’s a lot of logic here that makes it easier for Jedi (and other libraries) to deal with a Python syntax tree.
By using get_code on a module, you can get back the 1-to-1 representation of the input given to the parser. This is important if you are using refactoring.
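A toy sketch of how such a lossless round trip can work in principle: every leaf remembers the whitespace in front of it, so joining all leaves reproduces the input exactly. The function names and the regular expression are assumptions made for this illustration, and the sketch does not claim to mirror how the Jedi tree stores its data.

```python
import re

# Each leaf keeps the whitespace that precedes it (its "prefix").
_LEAF = re.compile(r"(\s*)(\S+|\Z)")


def split_into_leaves(source):
    """Split source into (prefix, value) pairs without dropping anything."""
    return [match.groups() for match in _LEAF.finditer(source)]


def get_code(leaves):
    """Rebuild the exact source text from the leaves."""
    return "".join(prefix + value for prefix, value in leaves)


SOURCE = "def f( x ):\n\treturn  x   # note\n\n"
```

Because nothing is normalized away, a refactoring can change a single leaf and write the rest of the file back untouched.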
The easiest way to play with this module is to use ParserWithRecovery.
parsing.Parser.module holds an instance of Module:
>>> from jedi._compatibility import u
>>> from jedi.parser import ParserWithRecovery, load_grammar
>>> parser = ParserWithRecovery(load_grammar(), u('import os'), 'example.py')
>>> submodule = parser.module
>>> submodule
<Module: example.py@1-1>
Any subclasses of
Module has an attribute
>>> submodule.imports
[<ImportName: import os@1,0>]
For static analysis purposes there exists a method called
nodes_to_execute on all nodes and leaves. It’s documented in the static
Class inheritance diagram:
Evaluation of python code (evaluate/__init__.py)¶
Evaluation of Python code in Jedi is based on three assumptions:
- The code uses as few side effects as possible. Jedi understands certain list/tuple/set modifications, but there’s no guarantee that Jedi detects everything (list.append in different modules, for example).
- No magic is being used:
- writing to
- The programmer is not a total dick, e.g. like this :-)
The actual algorithm is based on a principle called lazy evaluation. If you don’t know about it, google it. That said, the typical entry point for static analysis is calling eval_statement. There’s separate logic for autocompletion in the API; the evaluator is all about evaluating an expression.
Now you need to understand what follows after eval_statement. Let’s make an example:
import datetime
datetime.date.toda# <-- cursor here
First of all, this module doesn’t care about completion. It really just cares about
datetime.date. At the end of the procedure
To visualize this (simplified):
- Evaluator.eval_statement doesn’t do much, because there’s no assignment.
- Evaluator.eval_element cares for resolving the dotted path.
- Evaluator.find_types searches for global definitions of datetime, which it finds in the definition of an import, by scanning the syntax tree.
- Using the import logic, the datetime module is found.
- find_types is called again to find date inside the datetime module.
Now what would happen if we wanted datetime.date.foo.bar? Two more calls to find_types. However the second call would be ignored, because the first one would return nothing (there’s no foo attribute in date).
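The step-by-step resolution of a dotted path can be sketched with a toy model. Everything here is hypothetical: nested dicts stand in for modules and classes, and `find_name` and `resolve_dotted` are illustrative names, not the evaluator's real functions. The sketch only shows that each name costs one lookup and that later names are skipped once a lookup returns nothing.

```python
# Hypothetical stand-in for module contents: nested dicts map names to scopes.
MODULES = {
    "datetime": {
        "date": {"today": {}, "year": {}},
        "time": {},
    },
}


def find_name(scope, name):
    """Look up one name in one scope; return a list of matching scopes."""
    return [scope[name]] if name in scope else []


def resolve_dotted(path, global_scope=MODULES):
    """Resolve a dotted path step by step.

    Returns the scopes that were found and the number of lookups performed.
    """
    scopes = [global_scope]
    lookups = 0
    for name in path.split("."):
        if not scopes:
            # The previous step found nothing, so later names are skipped.
            break
        lookups += 1
        scopes = [found for scope in scopes for found in find_name(scope, name)]
    return scopes, lookups
```

In this model, `resolve_dotted("datetime.date.foo.bar")` performs three lookups: `foo` is searched and not found, and `bar` is never looked up.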
What if the import would contain another
ExprStmt like this:
from foo import bar
Date = bar.baz
Well... You get it. Just another eval_statement recursion. It’s really easy. Python can obviously get way more complicated than this. To understand tuple assignments, list comprehensions and everything else, a lot more code had to be written.
Jedi has been tested very well, so you can just start modifying code. It’s best to write your own test first for your “new” feature. Don’t be scared of breaking stuff. As long as the tests pass, you’re most likely to be fine.
I need to mention now that lazy evaluation is really good because it only evaluates what needs to be evaluated. All the statements and modules that are not used are just being ignored.
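A minimal sketch of the lazy evaluation idea, under the assumption that every definition can be modelled as a callable that is run only when its name is requested. `LazyNamespace` and the sample definitions are hypothetical and far simpler than the real evaluator.

```python
class LazyNamespace:
    """Maps names to callables that are run only on demand."""

    def __init__(self, definitions):
        self._definitions = definitions
        self._cache = {}
        self.evaluated = []  # names that were actually evaluated, in order

    def get(self, name):
        if name not in self._cache:
            self.evaluated.append(name)
            self._cache[name] = self._definitions[name](self)
        return self._cache[name]


# Hypothetical "module": each definition receives the namespace so it can
# ask for other names, which triggers their evaluation only when needed.
namespace = LazyNamespace({
    "bar": lambda ns: {"baz": "date class"},
    "Date": lambda ns: ns.get("bar")["baz"],
    "unused": lambda ns: 1 / 0,  # would fail, but is never evaluated
})

result = namespace.get("Date")
```

Asking for `Date` pulls in `bar` and nothing else; the `unused` definition is never touched, which is why unused statements cost nothing.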
Evaluation Representation (evaluate/representation.py)¶
As described in the
there’s a need for an AST-like module to represent the states of parsed
But now there are also structures in Python that need a little bit more than
Instance for example is only a
Class before it is
instantiated. This class represents these cases.
So, why is there also a
Class class here? Well, there are decorators and
they change classes in Python 3.
Representation modules also define “magic methods”. Those methods look like
py__foo__ and are typically mappable to the Python equivalents
and others. Here’s a list:
| Method | Description |
|---|---|
| py__call__(params: Array) | On callable objects, returns types. |
| py__bool__() | Returns True/False/None; None means that there’s no certainty. |
| py__bases__() | Returns a list of base classes. |
| py__mro__() | Returns a list of classes (the mro). |
| py__iter__() | Returns a generator of a set of types. |
| py__class__() | Returns the class of an instance. |
| py__getitem__(index: int/str) | Returns a set of types of the index. Can raise an IndexError/KeyError. |
| py__file__() | Only on modules. Returns None if it does not exist. |
| py__package__() | Only on modules. For the import system. |
| py__path__() | Only on modules. For the import system. |
| py__get__(call_object) | Only on instances. Simulates descriptors. |
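To give a feeling for the naming scheme, here is a toy wrapper that exposes a few of these methods on top of a real Python class. `ToyClassWrapper` is hypothetical and, unlike the representation classes, it inspects a live class object instead of source code; it is only meant to show how the py__foo__ names line up with their Python counterparts.

```python
class ToyClassWrapper:
    """Hypothetical wrapper exposing a real class through py__*__ methods."""

    def __init__(self, cls):
        self._cls = cls

    def py__bases__(self):
        # Counterpart of the __bases__ attribute.
        return list(self._cls.__bases__)

    def py__mro__(self):
        # Counterpart of the __mro__ attribute.
        return list(self._cls.__mro__)

    def py__bool__(self):
        # None would mean "no certainty"; this sketch treats a class as truthy.
        return True

    def py__call__(self, params):
        # Calling a class yields instances of it, so return that type.
        return {self._cls}


class Base:
    pass


class Child(Base):
    pass


wrapper = ToyClassWrapper(Child)
```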
Name resolution (evaluate/finder.py)¶
Searches for names with a given scope and name. This is very central in Jedi and
Python. The name resolution is quite complicated with descriptors,
If you want to understand name resolution, please read the first few chapters in http://blog.ionelmc.ro/2015/02/09/understanding-python-metaclasses/.
Flow checks are not really mature. There’s only a check for isinstance: it checks whether a flow has the form if isinstance(a, type_or_tuple). Unfortunately everything else is ignored (e.g. a == '' would be easy to check for -> a is a string). There’s big potential in these checks.
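The kind of pattern such a flow check looks for can be sketched with the standard ast module. This is an illustration only: the finder works on Jedi's own tree, and `isinstance_checks` is a made-up helper. It recognizes `if isinstance(name, type_or_tuple)` and ignores every other condition, including the `a == ''` case.

```python
import ast


def isinstance_checks(source):
    """Find `if isinstance(name, type_or_tuple)` tests in the source.

    Returns a list of (variable name, [type names]) pairs.
    """
    results = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.If):
            continue
        test = node.test
        if not (isinstance(test, ast.Call)
                and isinstance(test.func, ast.Name)
                and test.func.id == "isinstance"
                and len(test.args) == 2
                and isinstance(test.args[0], ast.Name)):
            continue
        target, types = test.args
        # The second argument is either a single type or a tuple of types.
        elements = types.elts if isinstance(types, ast.Tuple) else [types]
        names = [elt.id for elt in elements if isinstance(elt, ast.Name)]
        results.append((target.id, names))
    return results


EXAMPLE = """
if isinstance(a, str):
    pass
if isinstance(b, (int, float)):
    pass
if a == '':
    pass
"""
```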
API (api.py and api_classes.py)¶
The API has been designed to be as easy to use as possible. The API documentation can be found here. The API itself contains little code that needs to be mentioned here. Generally I’m trying to be conservative with the API. I’d rather not add new API features if they are not necessary, because it’s much harder to deprecate stuff than to add it later.
Core Extensions is a summary of the following topics:
These topics are very important for understanding what Jedi additionally does, but they could be removed from Jedi and Jedi would still work, just slower and without some features.
Iterables & Dynamic Arrays (evaluate/iterable.py)¶
To understand Python on a deeper level, Jedi needs to understand some of the dynamic features of Python like lists that are filled after creation:
Contains all classes and functions to deal with lists, dicts, generators and iterators in general.
If the content of an array (
list) is requested somewhere, the
current module will be checked for appearances of
arr.insert, etc. If the
arr name points to an actual array, the
content will be added
This can be really CPU intensive, as you can imagine, because Jedi has to find every append and check whether it’s the right array. However, this works pretty well, because in slow cases the recursion detector and other settings will stop this process.
It is important to note that:
- Array modifications work only in the current module.
- Jedi only checks Array additions; list.pop, etc. are ignored.
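The scan for array additions can be approximated with the standard ast module. This sketch is an assumption-laden simplification: it only looks at literal values, it matches the array purely by name, and `added_value_types` is a made-up helper rather than anything from evaluate/iterable.py.

```python
import ast

# Method name -> position of the added value among the call arguments.
ADDING_METHODS = {"append": 0, "insert": 1}


def added_value_types(source, array_name):
    """Collect type names of literals added to `array_name`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id == array_name
                and node.func.attr in ADDING_METHODS):
            continue
        index = ADDING_METHODS[node.func.attr]
        if index < len(node.args) and isinstance(node.args[index], ast.Constant):
            found.add(type(node.args[index].value).__name__)
    return found


EXAMPLE = """
arr = []
arr.append(1)
arr.insert(0, 'text')
other.append(2.5)
arr.pop()
"""
```

Calls on `other` and the `arr.pop()` call are skipped, which matches the two restrictions listed above: only additions count, and only for the array in question.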
Parameter completion (evaluate/dynamic.py)¶
One of the really important features of Jedi is to have an option to understand code like this:
def foo(bar):
    bar. # completion here
foo(1)
There’s no doubt whether bar is an int or not, but if there’s also a call like foo('str'), what would happen? Well, we’ll just show both, because that’s what a human would expect.
It works as follows:
- Jedi sees a param
- search for function calls named foo
- execute these calls and check the input. This works with a
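The idea behind these steps can be sketched with the standard ast module. This is not the code in evaluate/dynamic.py: `argument_types` is a hypothetical helper, it only handles positional literal arguments, and it needs syntactically valid input, so the function body below uses `return bar` instead of an unfinished `bar.`.

```python
import ast


def argument_types(source, function_name):
    """Map positional parameters of a function to the literal types passed in."""
    tree = ast.parse(source)
    params = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == function_name:
            params = [arg.arg for arg in node.args.args]
            break
    types = {name: set() for name in params}
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == function_name):
            # Pair each positional argument with its parameter name.
            for name, value in zip(params, node.args):
                if isinstance(value, ast.Constant):
                    types[name].add(type(value.value).__name__)
    return types


EXAMPLE = """
def foo(bar):
    return bar

foo(1)
foo('str')
"""
```

With both calls present, `bar` ends up with two candidate types, so a completion would offer the attributes of both.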
Diff Parser (parser/diff.py)¶
Basically a parser that is faster, because it tries to parse only parts of the code: if anything changes, it reparses only the changed parts.
It works with a simple diff in the beginning and will try to reuse old parser fragments.
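The diff-then-reuse idea can be sketched with difflib from the standard library. This is a rough illustration under the assumption that code is compared line by line; `reparse_plan` is a made-up helper and says nothing about how parser/diff.py actually maps lines back to tree fragments.

```python
import difflib


def reparse_plan(old_lines, new_lines):
    """Return (action, lines) pairs: reuse unchanged blocks, reparse the rest."""
    plan = []
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines)
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if j1 == j2:
            continue  # pure deletion: nothing left to parse in the new code
        action = "reuse" if tag == "equal" else "reparse"
        plan.append((action, new_lines[j1:j2]))
    return plan


OLD = ["import os", "def f():", "    return 1", "x = f()"]
NEW = ["import os", "def f():", "    return 2", "x = f()"]
```

For the sample above, only the changed `return` line is marked for reparsing; the lines before and after it are candidates for reuse.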
Docstrings are another source of information for functions and classes.
jedi.evaluate.dynamic tries to find all executions of functions, while
the docstring parsing is much easier. There are two different types of
docstrings that Jedi understands:
For example, the Sphinx annotation :type foo: str clearly states that the type of foo is str.
As an addition to parameter searching, this module also provides return annotations.
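Extracting this information from a Sphinx-style docstring can be sketched with two regular expressions. The patterns and the `docstring_types` helper are assumptions made for this illustration and only cover the simple `:type name: ...` and `:rtype: ...` fields; the real docstring module may handle more cases.

```python
import re

PARAM_TYPE = re.compile(r":type\s+(\w+):\s*([^\n]+)")
RETURN_TYPE = re.compile(r":rtype:\s*([^\n]+)")


def docstring_types(docstring):
    """Return (param name -> type string, return type string or None)."""
    params = {name: type_.strip()
              for name, type_ in PARAM_TYPE.findall(docstring)}
    match = RETURN_TYPE.search(docstring)
    return params, (match.group(1).strip() if match else None)


def example(foo):
    """Repeat a string.

    :type foo: str
    :rtype: list
    """
    return [foo, foo]
```

Compared with searching for every call of a function, reading these fields is a single pass over one string.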
Introduces some basic refactoring functions to Jedi. This module is still in a very early development stage and needs much testing and improvement.
I won’t do too much here, but if anyone wants to step in, please do. Refactoring is not one of my priorities.
It uses the Jedi API and currently supports the following functions (sometimes bug-prone):
- extract variable
- inline variable
Imports & Modules¶
Compiled Modules (evaluate/compiled.py)¶
Imitate the parser representation.
jedi.evaluate.imports is here to resolve import statements and return
the modules/classes/functions/whatever, which they stand for. However there’s
not any actual importing done. This module is about finding modules in the
filesystem. This can be quite tricky sometimes, because Python imports are not
always that simple.
This module uses imp for Python up to 3.2 and importlib from Python 3.3 on; the correct implementation is delegated to _compatibility.
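On current Python versions, locating a module on the filesystem without importing it can be sketched with `importlib.util.find_spec`. This is a standard library illustration, not the code in jedi.evaluate.imports; `locate_module` is a made-up helper, and the sketch sticks to top-level names, because looking up a submodule would import its parent package.

```python
import importlib.util


def locate_module(name):
    """Return where a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # For regular modules this is a file path; built-ins report "built-in".
    return spec.origin
```

A call such as `locate_module("json")` returns the location of the package's `__init__` file on a typical CPython installation, while an unknown name yields `None`.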
This module also supports import autocompletion, which means completing statements like from datetim (with the cursor at the end, this would return datetime).
Caching & Recursions¶
This caching is very important for speed and memory optimizations. There’s nothing really spectacular, just some decorators. The following cache types are available:
- module caching (load_parser and save_parser), which uses pickle and is
really important to assure low load times of modules like
- time_cache can be used to cache something for just a limited time span, which can be useful if there’s user interaction and the user cannot react faster than a certain time.
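A time-limited cache of this kind can be sketched as a decorator. The version below reuses the name time_cache for readability, but it is an independent illustration and not Jedi's implementation; the `clock` parameter and the `load` function are assumptions added so that the behaviour can be shown without waiting.

```python
import functools
import time


def time_cache(seconds, clock=time.monotonic):
    """Cache results per argument tuple for a limited time span."""
    def decorator(func):
        cache = {}  # args -> (expiry time, value)

        @functools.wraps(func)
        def wrapper(*args):
            now = clock()
            if args in cache and cache[args][0] > now:
                return cache[args][1]
            value = func(*args)
            cache[args] = (now + seconds, value)
            return value
        return wrapper
    return decorator


# A fake clock keeps the example deterministic.
fake_now = [0.0]
calls = []


@time_cache(10, clock=lambda: fake_now[0])
def load(name):
    calls.append(name)
    return name.upper()
```

Within the ten "seconds" of the fake clock, repeated calls to `load` with the same argument are answered from the cache; once the clock moves past the expiry time, the function runs again.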
This module is one of the reasons why Jedi is not thread-safe. As you can see there are global variables, which are holding the cache information. Some of these variables are being cleaned after every API usage.
Recursions are the recipe of Jedi to conquer Python code. However, someone must stop recursions going mad. Some settings are here to make Jedi stop at the right time. You can read more about them here.
Like jedi.evaluate.cache, this module also makes Jedi not thread-safe: execution_recursion_decorator uses class variables to count the function calls.
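How a call counter can stop a runaway recursion is sketched below. `limit_recursion` is hypothetical and stores its counter on the wrapper function rather than in class variables, but it shares the property discussed here: the counter is global state, so two threads would interfere with each other.

```python
import functools


def limit_recursion(max_depth, default=None):
    """Stop a recursive function once it is nested max_depth levels deep."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if wrapper.depth >= max_depth:
                return default  # give up instead of recursing further
            wrapper.depth += 1
            try:
                return func(*args, **kwargs)
            finally:
                wrapper.depth -= 1
        # State shared by every caller, which is what breaks thread safety.
        wrapper.depth = 0
        return wrapper
    return decorator


@limit_recursion(max_depth=5, default=0)
def follow(name):
    # Hypothetical self-referential definition: resolving it never terminates.
    return 1 + follow(name)
```

Without the decorator, `follow` would recurse until Python raises a RecursionError; with it, the sixth nested call returns the default and the result is simply the number of levels that were allowed.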
Most other modules are not really central to how Jedi works. They all contain relevant code, but if you understand the modules above, you pretty much understand Jedi.
Python 2/3 compatibility (_compatibility.py)¶
To ensure compatibility from Python
3.3, a module has been
created. Clearly there is a huge need to use conforming syntax.