memory.enable_memory_arena_shrinkage is not working in python #23339

Open
vsbaldeev opened this issue Jan 13, 2025 · 1 comment

@vsbaldeev

Describe the issue

I'm trying to reduce the memory consumption of a process that uses an onnxruntime InferenceSession. To achieve this, I call InferenceSession.run with the memory.enable_memory_arena_shrinkage run option, but it doesn't seem to have any effect. How do I do this in Python?

To reproduce

import uuid

import numpy
import onnxruntime
import psutil
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import StringTensorType
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import Pipeline

def create_trained_model():
    x = [str(uuid.uuid4()) for _ in range(100)]
    y = ["a" for _ in range(50)] + ["b" for _ in range(50)]

    pipeline = Pipeline(
        steps=[
            ("vectorizer", CountVectorizer()),
            ("classifier", RandomForestClassifier())
        ]
    )

    pipeline.fit(x, y)
    onnx_model_proto_string = convert_sklearn(
        pipeline,
        initial_types=[('features', StringTensorType((None,)))],
        verbose=False
    ).SerializeToString()

    return onnxruntime.InferenceSession(onnx_model_proto_string, providers=["CPUExecutionProvider"])


def get_one_prediction(onnx_model, input_data, shrink_arena=False):
    if shrink_arena:
        run_options = onnxruntime.RunOptions()
        # Request arena shrinkage after this run; the value is a
        # semicolon-separated device list, e.g. "cpu:0" or "cpu:0;gpu:0".
        run_options.add_run_config_entry("memory.enable_memory_arena_shrinkage", "cpu:0")
    else:
        run_options = None

    return onnx_model.run(
        [onnx_model.get_outputs()[1].name],
        {onnx_model.get_inputs()[0].name: input_data},
        run_options=run_options
    )


def get_used_megabytes():
    # USS (unique set size): memory private to this process, in MiB.
    return psutil.Process().memory_full_info().uss / (1024 * 1024)


def run_model_many_times(onnx_model, count):
    print("Memory in the beginning = ", get_used_megabytes())

    for i in range(count):
        input_data = numpy.array([str(uuid.uuid4()) for _ in range(40)])

        get_one_prediction(onnx_model, input_data)

        if i % 10000 == 0:
            # Periodically run once with shrinkage requested and compare USS.
            print("Memory before shrinkage = ", get_used_megabytes())
            get_one_prediction(onnx_model, input_data, True)
            print("Memory after shrinkage = ", get_used_megabytes())

    print("Memory in the end = ", get_used_megabytes())


def main():
    print("Memory before model's creation =  ", get_used_megabytes())
    onnx_model = create_trained_model()
    count = 100000
    run_model_many_times(onnx_model, count)

if __name__ == '__main__':
    main()

This code produces the following output:

Memory before model's creation = 98.734375
Memory in the beginning = 104.390625
Memory before shrinkage = 104.46875
Memory after shrinkage = 104.46875
Memory before shrinkage = 104.859375
Memory after shrinkage = 104.859375
Memory before shrinkage = 104.859375
Memory after shrinkage = 104.859375
Memory before shrinkage = 104.90625
Memory after shrinkage = 104.90625
Memory before shrinkage = 104.9375
Memory after shrinkage = 104.9375
Memory before shrinkage = 104.953125
Memory after shrinkage = 104.953125
Memory before shrinkage = 104.953125
Memory after shrinkage = 104.953125
Memory before shrinkage = 104.984375
Memory after shrinkage = 104.984375
Memory before shrinkage = 105.109375
Memory after shrinkage = 105.109375
Memory before shrinkage = 105.125
Memory after shrinkage = 105.125
Memory in the end = 105.140625

Urgency

No response

Platform

Linux

OS Version

Ubuntu 20.04.6 LTS

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.20.1

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

@yuslepukhin
Member

This feature was primarily introduced for GPU memory.
For CPU, we recommend disabling the arena altogether and seeing whether the default allocator does a better job (it often does).
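A minimal sketch of that suggestion, assuming the same onnx_model_proto_string bytes as in the repro above; enable_cpu_mem_arena is the SessionOptions flag that controls the CPU arena:

import onnxruntime

# Create the session with the CPU memory arena disabled, so allocations
# go through the default allocator instead of being cached by the arena.
sess_options = onnxruntime.SessionOptions()
sess_options.enable_cpu_mem_arena = False

onnx_model = onnxruntime.InferenceSession(
    onnx_model_proto_string,
    sess_options=sess_options,
    providers=["CPUExecutionProvider"],
)

With the arena disabled there is nothing to shrink, so the memory.enable_memory_arena_shrinkage run option is no longer needed.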
