Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Refactor runtime executor for better extensibility/configurability #6297

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

cshimmin
Copy link

@cshimmin cshimmin commented Jan 15, 2025

Overview

This PR refactors the action_execution_server and ActionExecutor, to facilitate extensibility and customization of the executor within the runtime. In particular:

  • Definition of the executor class is moved out of the action_executor_server.py script, leaving it to focus on the API server implementation.
  • The ActionExecutor class has been refactored into a hierarchy:
    • RuntimeExecutor in runtime/executor/base.py handles the setup/teardown of the common resources: bash session, browser, plugins, etc.
    • BaseActionExecutor in runtime/executor/action_executor.py adds the dynamic run_action dispatcher.
    • ActionExecutor in the same file, adds the detailed implementation for all of the standard runners (run_ipython, read, browse, etc).

Importantly, the executor class to be used as been added to the AppConfig and can be used to specify an arbitrary class (even from an external module) to provide the executor implementation, as long as it implements the base interface. For example, you can set runtime_executor = "my_package.some_module:MyCustomExecutor".

Note that I have not changed any code/functionality of the existing ActionExecutor, just distributed its code around into the three-class hierarchy. There are minor changes elsewhere in the code to accommodate the new configuration setting.

Rationale

This change makes it possible to easily extend/override/customize the builtin executor implementation, without having to operate on core classes in the OpenHands repo (or even make change to the repo at all).

Caveats

In the exiting ActionExecutor, there is a block of code labeled "TODO" in the ainit method, which evidently is meant to be removed pending an unrelated refactoring operation. This temporary block of code depends on the run_ipython method, which is not implemented in the base class. Rather that introducing additional abstractions to accommodate this temporary code, I have elected to leave it in place in the base implementation. However this means executor classes will fail to initialize unless they provide the run_ipython method (as the default ActionExecutor does).

A more general fix would be to add an overridable hook method that can be called in the ainit function, which I'm happy to add if deemed necessary.

In the the base RuntimeExecutor in runtime/executor/base.py, there

End-user friendly description of the problem this fixes or functionality that this introduces

  • Include this change in the Release Notes. If checked, you must provide an end-user friendly description for your change below

Runtime executor can now be subclassed to provide custom or extended functionality.
The executor class used in the runtime can be configured with:

[core]
runtime_executor = "your_package.your_module:YourCustomExecutorClass"

Give a summary of what the PR does, explaining any non-trivial design decisions


Link of any specific issues this addresses

Extracts runtime execution logic into dedicated modules for better organization and extensibility:
- Splits ActionExecutor into base and implementation classes
- Adds support for custom executor implementations via config
- Maintains backward compatibility with existing execution logic
- Allows specifying executor class from arbitrary python modules via new core.runtime_executor config.

This change enables easier customization of runtime behavior and cleaner separation of concerns.
@rbren
Copy link
Collaborator

rbren commented Jan 17, 2025

Hmm I think we've gone one step too far down the abstraction rabbit hole here

The Runtime class was meant to be something like your ActionExecutor here. Then we created a version of the Runtime that starts a remote ActionExecutionServer, and interfaces w/ it via HTTP. And now all the existing Runtime implementations do this.

What's the end goal here? What exactly are you trying to customize?

@cshimmin
Copy link
Author

cshimmin commented Jan 18, 2025

We are building a custom Agent that has its own Actions. So we have our own classes in agenthub, and in runtime/impl. Those are fairly extensible, in that they don't really care what kind of Action they are passing around. But the ActionExecutor is basically hard-coded to handle only the Actions in the base implementation. We can extend the ActionExecutor, and add run_<custom-action>() methods, but there's no way to tell OH to use that derived class; it's hard coded to use only ActionExecutor in the action_execution_server.py. Which means for now, we are hacking directly on this file, which sees quite a lot of updates each release.

Obviously we could implement a custom runtime that doesn't use action_execution_server, but there's a lot of useful infrastructure (i.e. the fastapi service, which is entirely defined in the if __name__ == "__main__" block, so we can't reuse it by importing), that works totally fine for our purposes. So, I have added a command line argument to the action_execution_server that lets you tell what class you'd like to use. I can remove the changes to ActionExecutor, and leave the class definition in this executable script, if you like.

The main thing we really need is the ability to tell the execution server what class to instantiate as the ActionExecutor, so I'm happy to pare it back to just that if you feel strongly about it. The rest of the refactoring was just about separating the runtime setup of the ActionExecutor from the per-action handlers. We either redefine, or don't need, those run_*() handlers. But they don't really get in our way so it's not a big deal.

FWIW I was also going to propose a PR to make the Agent and Runtime classes configurable by specifying an import path. Currently the runtime instantiation is handled with a string comparison lookup in runtime/__init__.py, which means at a minimum that file has to be hacked to support customizations. The situation with Agent is a bit better, but you still have to hack at some entry point (e.g. agenthub/__init__.py, or an alternative to server/listen.py) to ensure you call register() on your custom Agent. In both cases, it would be nice to be able to simply add (or append) package.module:ClassName in the config file and ensure the requested implementation gets instantiated or registered at the appropriate time.

@enyst
Copy link
Collaborator

enyst commented Jan 18, 2025

@cshimmin Just a quick note on one aspect: regarding the agent, I do think we should fix this. You're right, the code is almost where we need it, it's just hard-codes names instead of loading whatever it is in agenthub, or alternatively your suggestion. If you want to submit a PR on that, it will be welcome!

I'll defer to @rbren on the runtime design, but thank you for raising these points, they're worth thinking about, and sorry for the merge conflicts!

@cshimmin
Copy link
Author

@enyst sure thing, I'll send a PR for that tomorrow likely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants