(feat): per-IOC locking#165
Open
anderslindho wants to merge 6 commits into
Open
Conversation
Previously abort() returned the error on CancelledError and raised a new one on other failures, leaving the chain errored so any queued commit (notably the disconnect transaction) was never executed — IOCs could stay Active in CF after an upload failure. Co-authored-by: Sky Brewer <sky.brewer@ess.eu>
8fc2db4 to
85ed5e4
Compare
simon-ess
reviewed
May 18, 2026
jacomago
reviewed
May 18, 2026
Contributor
jacomago
left a comment
There was a problem hiding this comment.
I'm wondering if we can get better testing for this.
I'm thinking we could make the multi-ioc container tests could have maxActive = 2, since there are 4 iocs.
Would also be nice to have an IOC with more channels than the commitMax which is defaulted to 5000, could simply set it to 1 and make sure tests work?
The single global DeferredLock serialised all IOC commits behind one queue. Under load, the lock depth determined how long IOCs waited to be marked active or inactive, and one stuck commit blocked every other. Per-IOC locks let different IOCs commit in parallel while keeping same-IOC transactions serialised. _state_lock (threading.Lock) guards the shared iocs/channel_ioc_ids dicts against concurrent thread writes. The CF push receives a targeted snapshot of the channels relevant to this commit (records being deleted plus the IOC's current channel set) so snapshot cost scales with the commit rather than total channel count. _ioc_channels tracks which channels belong to each IOC for O(1) disconnect cost; registration deduplicates to prevent double-counting. stopService drains all in-flight per-IOC locks before running the clean_on_stop sweep, preventing Active pushes from racing the sweep. The commit path moves to @inlineCallbacks so retry waits use task.deferLater — no sleeping worker threads between attempts. Also introduces CFUpdateAbortedError to distinguish exhausted CF retries from cancellation, and fixes chain_error to surface unexpected exceptions rather than swallowing them. Co-authored-by: Sky Brewer <sky.brewer@ess.eu>
deferToThread is monkeypatched to return synchronously-resolved Deferreds so the full _commit_with_lock callback chain can be exercised without a reactor. The prepare_result tuple matches the (ioc_info, record_info_by_name, records_to_delete, iocs_snap, ciids_snap) signature of _prepare_commit; the push phase is controlled via _push_to_cf_async. Covers lock identity, iocid routing, prune lifecycle, and all four chain_error paths. Co-authored-by: Sky Brewer <sky.brewer@ess.eu>
The global lock is gone; update demo.conf to reflect that a stuck pushAlwaysRetry now only holds that IOC's lock rather than blocking all commits, fix an incorrect default value comment, and correct a long-standing typo in the same pass. Also rename single-letter CastFactory identifiers (P/P2 → proto/waiting, addr → _addr) to satisfy linter, and correct the _stop_service_with_lock docstring which still claimed the lifecycle lock prevents concurrent commits (commits now use per-IOC locks and are drained explicitly before the clean_on_stop sweep).
application.py unconditionally overwrote session.trlimit with the config value, which defaulted to 0 (no limit), silently negating the trlimit=5000 class default added in 89be20c. Reference CollectionSession.trlimit directly so the default cannot silently drift if the class default ever changes, and add a regression test.
538b445 to
c51af3e
Compare
Twisted's default thread pool size is 10. With maxActive IOC commits in-flight, each holding a pool thread for the CF HTTP call, the pool becomes the bottleneck before maxActive connections are ever fully utilised. Set the pool to maxActive + 10 (headroom for clean_service and other thread users) so throughput scales with the configured connection limit rather than the thread pool default.
c51af3e to
d35922d
Compare
|
jacomago
reviewed
May 19, 2026
Contributor
jacomago
left a comment
There was a problem hiding this comment.
Still think we should have more integration tests where the ioc count is purposefully above the configuration maxActive etc. Maybe even just alter the configuraiton for the multi-ioc tests to have maxActive =1 .
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.



No description provided.