Jupyter Code Executor in v0.4 (alternative implementation) #4885

Leon0402 · 2025-01-03T13:41:24Z

Why are these changes needed?

The new v0.4 version does not yet support Jupyter notebooks, unlike the old version.

NOTE: This is an alternative implementation to #4795 that uses the nbclient library rather than implementing the protocol ourselves. While nbclient is probably not made for this use case, it works quite well with only small hacks. Additionally, no explicit external Jupyter gateway server is used anymore (although that could be removed in the other implementation as well).
Compared to the other implementation, it is much simpler and, more importantly, much more robust. I extensively tested both implementations in a real-life scenario, and the other one suffered from frequent timeouts and got stuck after running for a while. I debugged it quite a bit, but I had the feeling that providing a robust implementation is quite complicated and better left to experts in that domain (i.e. nbclient).

Related issue number

Closes #4792

Checks

I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
I've added tests (if relevant) corresponding to the changes introduced in this PR.
I've made sure all auto checks have passed.

jackgerrits

This looks great! I agree, this feels like the better route to take. Simpler, more robust and offloads the complexity of processing the notebook to a library built for the task!

python/packages/autogen-ext/pyproject.toml

Leon0402 · 2025-01-04T18:09:17Z

@jackgerrits Updated the code; I was missing some local changes that I had made in nbclient.

But there is a memory leak somewhere. When I run my program, I can see memory grow dramatically. After 40 minutes it was multiple GB. I run 10 tasks in parallel, where each task is a dialog with up to 20 turns. While I work a lot with images, which could lead to spikes in individual dialogs, memory consumption should not increase over time; it should be roughly constant or show (unpredictable) spikes.
Unfortunately, I don't really know whether it is related to this patch or is a general issue in AutoGen. I have already spent significant time debugging, but I find it hard to pin down the exact cause. Have you experienced any similar issue without my PR?
I could use some help :)

Some context:

import asyncio

from tqdm.asyncio import tqdm  # provides the awaitable tqdm.gather used below


async def run_task(cfg: Config, task: TaskType):
    ...

    semaphore = asyncio.Semaphore(cfg.concurrency_limit)

    async def run_single_sample(task_runner: TaskRunner, task_sample: TaskSample):
        async with semaphore:
            await task_runner.run_agent(task_sample, cfg.output_dir / task.value)

    task_runner = TaskRunner(cfg)
    samples = [run_single_sample(task_runner, task_sample) for task_sample in sliced_samples]
    await tqdm.gather(*samples, desc=f"Task: {task.value}")


async def run_tasks(cfg: Config):
    for task in cfg.tasks:
        await run_task(cfg, task)

This is my code. task_runner.run_agent then defines the agents and the code executor (in an async with block), so I would generally expect each task to free its memory shortly after finishing in tqdm.gather.
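One way to check whether memory really grows from batch to batch, rather than just spiking, is to compare tracemalloc snapshots taken after each batch. The sketch below is a minimal illustration using only the standard library; leaky_batch and _retained are hypothetical stand-ins for a batch of dialogs and for whatever object keeps references alive, not AutoGen code.

```python
import tracemalloc

# Hypothetical stand-in for a long-lived object that keeps references alive.
_retained: list[bytes] = []


def leaky_batch(n_items: int = 100, size: int = 10_000) -> None:
    """Simulate a batch of work that accidentally retains its results."""
    for _ in range(n_items):
        _retained.append(bytes(size))


def growth_between_batches(batches: int = 3) -> list[int]:
    """Return the net number of bytes still allocated after each batch."""
    tracemalloc.start()
    growth = []
    previous = tracemalloc.take_snapshot()
    for _ in range(batches):
        leaky_batch()
        current = tracemalloc.take_snapshot()
        # Positive size_diff means memory allocated since `previous` is still alive.
        stats = current.compare_to(previous, "lineno")
        growth.append(sum(stat.size_diff for stat in stats))
        previous = current
    tracemalloc.stop()
    return growth


if __name__ == "__main__":
    for i, delta in enumerate(growth_between_batches(), start=1):
        print(f"batch {i}: {delta / 1e6:.1f} MB retained")
```

If the per-batch growth stays clearly positive, sorting the compare_to statistics by size_diff points at the source lines that allocated the retained memory.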

codecov · 2025-01-04T22:57:43Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.95%. Comparing base (9570e82) to head (8397d22).
Report is 2 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #4885      +/-   ##
==========================================
+ Coverage   68.53%   72.95%   +4.42%     
==========================================
  Files         156      115      -41     
  Lines       10140     6784    -3356     
==========================================
- Hits         6949     4949    -2000     
+ Misses       3191     1835    -1356

Flag	Coverage Δ
unittests	`72.95% <ø> (+4.42%)`	⬆️


ekzhu · 2025-01-04T23:02:29Z

@Leon0402 Can you show where your runtime is created? This might be because the runtime is not removing references to created agents.

To mitigate, you might want to create a new runtime instance for each task.

I think we should handle it in a separate PR.
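The mitigation suggested above (a new runtime per task, so that nothing outlives the task that created it) could look roughly like the following sketch. FakeRuntime is a hypothetical stand-in and not the AutoGen runtime API; only the scoping pattern is the point.

```python
import asyncio


class FakeRuntime:
    """Hypothetical stand-in for a runtime that holds references to agents."""

    live_instances = 0  # counts runtimes that have not been closed yet

    def __init__(self) -> None:
        self.agents: list[str] = []
        FakeRuntime.live_instances += 1

    def register(self, name: str) -> None:
        self.agents.append(name)

    async def close(self) -> None:
        # Drop agent references so they can be garbage collected.
        self.agents.clear()
        FakeRuntime.live_instances -= 1


async def run_sample(sample_id: int, semaphore: asyncio.Semaphore) -> int:
    async with semaphore:
        runtime = FakeRuntime()  # created per sample, never shared
        try:
            runtime.register(f"assistant-{sample_id}")
            await asyncio.sleep(0)  # placeholder for the actual dialog
            return len(runtime.agents)
        finally:
            await runtime.close()  # runs even if the dialog raises


async def run_all(n_samples: int, concurrency_limit: int) -> list[int]:
    semaphore = asyncio.Semaphore(concurrency_limit)
    return await asyncio.gather(
        *(run_sample(i, semaphore) for i in range(n_samples))
    )


if __name__ == "__main__":
    results = asyncio.run(run_all(n_samples=20, concurrency_limit=10))
    print(len(results), "samples finished;", FakeRuntime.live_instances, "runtimes still open")
```

Because each runtime is created and closed inside the semaphore-guarded block, at most concurrency_limit runtimes exist at any time.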

jackgerrits · 2025-01-06T14:28:39Z

Haven't seen this in the main branch myself; I'd be curious to know what's going on. I agree with Eric, that sounds plausible.

Leon0402 · 2025-01-13T21:54:39Z

@ekzhu @jackgerrits Now that v0.4 is released, do you think we could merge this in?

ekzhu · 2025-01-14T01:59:42Z

Thanks. Yes let's make this part of 0.4.2. Let me review it.

python/packages/autogen-ext/src/autogen_ext/code_executors/jupyter/_jupyter_code_executor.py

ekzhu · 2025-01-15T02:02:36Z

@Leon0402 I pushed the changes for API docs, you can take a look, and use it as an example for future PRs: 8397d22

python/packages/autogen-ext/src/autogen_ext/code_executors/jupyter/_jupyter_code_executor.py

Leon0402 · 2025-01-17T20:13:14Z

@ekzhu Added another example using CodeExecutorAgent. I also used a context manager in your second example. While it is possible to use start / stop instead of a context manager, users should only do so if they know what they are doing. In the example, it would leak resources if an exception were raised. Thus, start / stop should only be used inside a try/except/finally block or as part of your own context manager. For these simple examples, we should use the recommended and safe way: a with block.
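The difference between the two styles can be sketched with a stand-in class. DummyExecutor below is hypothetical and only mimics the start / stop and async with shape discussed here; it is not the JupyterCodeExecutor implementation.

```python
import asyncio


class DummyExecutor:
    """Hypothetical executor with explicit start/stop and `async with` support."""

    def __init__(self) -> None:
        self.running = False

    async def start(self) -> None:
        self.running = True  # e.g. launch a kernel

    async def stop(self) -> None:
        self.running = False  # e.g. shut the kernel down

    async def __aenter__(self) -> "DummyExecutor":
        await self.start()
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        await self.stop()  # always runs, also when the body raised

    async def execute(self, code: str) -> str:
        if "boom" in code:
            raise RuntimeError("execution failed")
        return f"ran: {code}"


async def leaky(executor: DummyExecutor) -> None:
    # Bare start/stop: stop() is skipped when execute() raises.
    await executor.start()
    await executor.execute("boom")
    await executor.stop()


async def safe(executor: DummyExecutor) -> None:
    # Recommended: the context manager stops the executor on any exit path.
    async with executor:
        await executor.execute("boom")


async def demo() -> tuple[bool, bool]:
    first, second = DummyExecutor(), DummyExecutor()
    for run, executor in ((leaky, first), (safe, second)):
        try:
            await run(executor)
        except RuntimeError:
            pass
    return first.running, second.running


if __name__ == "__main__":
    print(asyncio.run(demo()))  # (True, False): only the bare version leaks
```

If start / stop has to be called directly, wrapping the body in try/finally with stop() in the finally clause gives the same guarantee as the with block.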

Would it be possible to run the workflows automatically without maintainer approval? It is somewhat annoying that I can only see failing checks with a delay.

ekzhu · 2025-01-17T23:43:19Z

> Would it be possible to run the workflows automatically without maintainer approval? It is somewhat annoying that I can only see failing checks with a delay.

We will work on it :)

Meanwhile, you can run the checks locally by running poe checks from the python directory. For more info, see the developer guide under the python directory.

For the doc build check, use poe docs-check under python/packages/autogen-core.

Leon0402 added 6 commits January 3, 2025 13:49

Implement local jupyter notebook execution support

e014cb5

Add tests for images and html

c6acfa6

Add missing ipython kernel dep

a1281ac

Make Error Handling consistent with jupyter notebook.

c0a2b76

Use more sensible default for websocket connection

eae31ff

Implementation based on nbclient

4513422

Leon0402 changed the title ~~Jupyter Code Executor in v0.4 (alternative implementation~~ Jupyter Code Executor in v0.4 (alternative implementation) Jan 3, 2025

Leon0402 mentioned this pull request Jan 3, 2025

Local Jupyter Code Executor for Version 4 #4795

Draft

3 tasks

Fix stuff broken through rebase

980bd88

jackgerrits reviewed Jan 3, 2025

python/packages/autogen-ext/pyproject.toml (outdated)

Leon0402 marked this pull request as draft January 4, 2025 08:34

Leon0402 added 2 commits January 4, 2025 19:00

Update dependencies

0a12657

Implementation without changes to nbclient

99ff064

Leon0402 mentioned this pull request Jan 4, 2025

Debug Memory Leak in Autogen #4893

Open

Leon0402 added 2 commits January 13, 2025 22:51

Stop on first failure

362ef05

Merge remote-tracking branch 'upstream/main' into feauture/juputerCod…

9787183

…eExecutor2

Leon0402 marked this pull request as ready for review January 13, 2025 21:55

ekzhu reviewed Jan 14, 2025

Implement suggestions

05d69cb

ekzhu reviewed Jan 15, 2025

python/packages/autogen-ext/src/autogen_ext/code_executors/jupyter/_jupyter_code_executor.py

Add API docs

8397d22

ekzhu added this to the 0.4.2 milestone Jan 15, 2025

ekzhu added the proj-extensions label Jan 15, 2025

Merge branch 'main' into feauture/juputerCodeExecutor2

0ba8f97

jackgerrits reviewed Jan 17, 2025

python/packages/autogen-ext/src/autogen_ext/code_executors/jupyter/_jupyter_code_executor.py

Leon0402 added 2 commits January 17, 2025 21:05

Add example for CodeExecutorAgent

c178e20

Merge remote-tracking branch 'upstream/main' into feauture/juputerCod…

60202f7

…eExecutor2

Merge branch 'main' into feauture/juputerCodeExecutor2

7eec6a6

ekzhu approved these changes Jan 17, 2025
