feat(ourlogs): Allow log ingestion behind a flag #4448
This adds log ingestion (currently only working for the OTel log format) behind a feature flag 'organizations:ourlogs-ingestion'. This PR aims to be the minimum possible to support local and test-org ingestion before we move to dogfooding.

Other notes:
- We need to add two DataCategories because we need to track quantity (for current discarded breadcrumb ClientOutcome tracking) and also bytes for total log bytes ingested, which is one of the quota recommendations.
- Eventually we will convert Breadcrumbs into logs as well, very similar to span extraction for spans on the event. How exactly that will work is still being discussed with product and sdk folks.
- The name 'ourlogs' is an internal name to disambiguate between 'our log product' logs and internally created logs. User facing strings will be set to 'Log' to avoid exposing implementation details.
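For readers unfamiliar with flag-gated ingestion, here is a minimal sketch of the idea. Only the flag name comes from the description above; `LogItemAction`, `gate_log_item`, and the use of a plain `HashSet` of enabled feature names are hypothetical stand-ins, not Relay's actual feature-handling API.

```rust
use std::collections::HashSet;

/// Flag name taken from the PR description.
const OURLOGS_INGESTION: &str = "organizations:ourlogs-ingestion";

/// Hypothetical decision for an incoming log item.
#[derive(Debug, PartialEq, Eq)]
enum LogItemAction {
    /// The organization has the flag, so the item continues through processing.
    Process,
    /// The flag is missing, so the item is discarded.
    Drop,
}

/// Hypothetical gate: log items are processed only when the flag is enabled.
fn gate_log_item(enabled_features: &HashSet<String>) -> LogItemAction {
    if enabled_features.contains(OURLOGS_INGESTION) {
        LogItemAction::Process
    } else {
        LogItemAction::Drop
    }
}
```

With a gate like this, organizations without the flag never reach the log processing path, which is what keeps ingestion limited to local and test-org use.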
pub struct OurLog {
    /// Time when the event occurred.
    #[metastructure(required = true, trim = false)]
    pub timestamp_nanos: Annotated<u64>,
Other data types accept both Unix timestamps and formatted date strings, although they do not have nanosecond precision:
relay/relay-event-schema/src/protocol/span.rs
Lines 16 to 18 in 4622914
    /// Timestamp when the span was ended.
    #[metastructure(required = true, trim = false)]
    pub timestamp: Annotated<Timestamp>,
Are nanos required for logs?
OTel defines them as nanos, the consumers will consume nanos, and they are stored as nanos. We may need breadcrumbs to switch from floats to nanos, but otherwise I think we are leaning towards keeping the same format throughout instead of having slightly different intermediate formats.
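To illustrate what switching from floats to nanos could involve, here is a rough sketch of converting a fractional-seconds Unix timestamp into whole nanoseconds. `seconds_to_nanos` is a hypothetical helper and not part of this PR. Note that an `f64` cannot represent present-day Unix timestamps to single-nanosecond precision (the spacing between representable values at that magnitude is a few hundred nanoseconds), so a conversion like this is lossy by nature.

```rust
/// Hypothetical helper: converts a Unix timestamp in fractional seconds
/// into whole nanoseconds, as a `u64` like `timestamp_nanos`.
///
/// Returns `None` for negative, non-finite, or out-of-range inputs.
fn seconds_to_nanos(seconds: f64) -> Option<u64> {
    if !seconds.is_finite() || seconds < 0.0 {
        return None;
    }
    // Scale to nanoseconds and round to the nearest integer value.
    let nanos = (seconds * 1e9).round();
    // `u64::MAX as f64` rounds up to 2^64, so anything at or above it
    // does not fit into a u64.
    if nanos >= u64::MAX as f64 {
        return None;
    }
    Some(nanos as u64)
}
```

For example, `seconds_to_nanos(1.5)` yields `Some(1_500_000_000)`, while negative or NaN inputs yield `None`.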
This is pulling out data categories from #4448 as their own PR.
Co-authored-by: Joris Bayer <[email protected]>
Looking good! It would be nice to have at least one integration test, see for example
relay/tests/integration/test_spans.py
Line 472 in bbe569d
def test_span_ingestion(
// We need to track the count and bytes separately for possible rate limits and quotas on both counts and bytes.
self.outcome_aggregator.send(TrackOutcome {
    category: DataCategory::LogItem,
    event_id: None,
    outcome: Outcome::Accepted,
    quantity: 1,
    remote_addr: None,
    scoping,
    timestamp: received_at,
});
self.outcome_aggregator.send(TrackOutcome {
    category: DataCategory::LogByte,
    event_id: None,
    outcome: Outcome::Accepted,
    quantity: payload_len as u32,
    remote_addr: None,
    scoping,
    timestamp: received_at,
});
I'm not a big fan of producing "accepted" outcomes in Relay. Ideally we would produce them as close to storage as possible, e.g. after storing the data in Snuba, so that we do not overcount items when data gets lost between Relay and storage. For spans, we moved the "accepted" outcomes to Relay because there was simply no consumer in sentry that could do it, and there was no precedent for creating outcomes in Snuba itself.
Has that changed now? Is there prior art to doing it in a snuba consumer etc.? I was actually thinking about that while adding them here because of the possibility of rejecting the item later on.
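Wherever the "accepted" outcomes end up being produced, the count/bytes split from the snippet above can be sketched in isolation. The types below (`SketchCategory`, `SketchOutcome`, `accepted_outcomes`) are hypothetical stand-ins rather than Relay's `TrackOutcome` or `DataCategory`. One difference from the diff is an assumption on my part: the byte quantity saturates at `u32::MAX` instead of using a bare `as u32` cast, which would silently wrap for payloads larger than 4 GiB.

```rust
/// Hypothetical stand-in for the two log data categories.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SketchCategory {
    LogItem,
    LogByte,
}

/// Hypothetical, simplified outcome record.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SketchOutcome {
    category: SketchCategory,
    quantity: u32,
}

/// Builds the pair of outcomes for one accepted log item:
/// one counting the item itself, one counting its payload bytes.
fn accepted_outcomes(payload_len: usize) -> [SketchOutcome; 2] {
    // Clamp before casting so oversized payloads saturate instead of wrapping.
    let bytes = payload_len.min(u32::MAX as usize) as u32;
    [
        SketchOutcome { category: SketchCategory::LogItem, quantity: 1 },
        SketchOutcome { category: SketchCategory::LogByte, quantity: bytes },
    ]
}
```

Keeping the pair in one function would also make it easier to move both outcomes to a downstream consumer later, since the count and byte quantities are derived in a single place.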