# modal.concurrent

```python
def concurrent(
    *,
    max_inputs: Optional[int] = None,  # Hard limit on each container's input concurrency
    target_inputs: Optional[int] = None,  # Input concurrency that Modal's autoscaler should target
) -> Callable[
    [Union[Callable[P, ReturnType], _PartialFunction[P, ReturnType, ReturnType]]],
    _PartialFunction[P, ReturnType, ReturnType],
]:
```

Decorator that allows individual containers to handle multiple inputs concurrently.

The concurrency mechanism depends on whether the function is async or not:
- Async functions will run inputs on a single thread as asyncio tasks.
- Synchronous functions will use multi-threading. The code must be thread-safe.

Input concurrency will be most useful for workflows that are IO-bound
(e.g., making network requests) or when running an inference server that supports
dynamic batching.

When `target_inputs` is set, Modal's autoscaler will try to provision resources
such that each container is running that many inputs concurrently, rather than
autoscaling based on `max_inputs`. Containers may burst up to up to `max_inputs`
if resources are insufficient to remain at the target concurrency, e.g. when the
arrival rate of inputs increases. This can trade-off a small increase in average
latency to avoid larger tail latencies from input queuing.

**Examples:**
```python
# Stack the decorator under `@app.function()` to enable input concurrency
@app.function()
@modal.concurrent(max_inputs=100)
async def f(data):
    # Async function; will be scheduled as asyncio task
    ...

# With `@app.cls()`, apply the decorator at the class level, not on individual methods
@app.cls()
@modal.concurrent(max_inputs=100, target_inputs=80)
class C:
    @modal.method()
    def f(self, data):
        # Sync function; must be thread-safe
        ...

```

*Added in v0.73.148:* This decorator replaces the `allow_concurrent_inputs` parameter
in `@app.function()` and `@app.cls()`.
