Question about 0.5 pseudo-count #75

@godotgildor

Looking at the code here:

    shared_counts = (
        self.store.select(
            "/main/{}/counts".format(label), "columns in ['c_0', c_last]"
        )
        .sum(axis="index")
        .values
        + 0.5
    )

I notice that for the denominator of the enrichment ratio (shared_counts), the code sums all the values for c_0 and c_last and then adds a single pseudo-count of 0.5 to each sum. Later, for the numerator, the code adds a pseudo-count of 0.5 to each individual count.

Wouldn't this have the effect of potentially skewing the ratios so that they wouldn't sum to 1? For instance, let's say c_0 = [1, 3, 1, 2]

Then shared_counts = (1 + 3 + 1 + 2) + 0.5 = 7.5

And then the ratios would be:
1.5/7.5 = 0.2
3.5/7.5 = 0.467
1.5/7.5 = 0.2
2.5/7.5 = 0.333

and the sum of the ratios = 1.2 instead of 1.

I would have thought that for shared_counts the code would add the 0.5 pseudo-count to each count before summing (or, equivalently, add a total pseudo-count of 0.5 * len(c_0) after summing).
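To make the arithmetic concrete, here is a minimal standalone sketch in plain Python. It is not the library's code: the names `PSEUDO`, `denom_single`, and `denom_per_variant` are hypothetical, and it only assumes my reading above of how the numerator and denominator are built.

```python
# Hypothetical illustration of the two pseudo-count schemes; not the project's code.
PSEUDO = 0.5
c_0 = [1, 3, 1, 2]  # example counts from this issue

# Numerator as I read it: 0.5 added to every individual count.
numerators = [c + PSEUDO for c in c_0]

# Denominator as I read the current code: a single 0.5 added after summing.
denom_single = sum(c_0) + PSEUDO
ratios_single = [n / denom_single for n in numerators]

# Suggested alternative: one 0.5 per count, i.e. 0.5 * len(c_0) in total.
denom_per_variant = sum(c_0) + PSEUDO * len(c_0)
ratios_per_variant = [n / denom_per_variant for n in numerators]

print("single pseudo-count, sum of ratios:", sum(ratios_single))
print("per-count pseudo-count, sum of ratios:", sum(ratios_per_variant))
```

With the single pseudo-count the ratios add up to roughly 1.2, while with one pseudo-count per count they add up to roughly 1 (up to floating-point rounding).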

I may be misreading things though.
