Mobile incidents · How we classify

Crashlytics tells you what broke.
Launch Vectors tells you why.

Four mobile incidents — and what a causality engine should surface in the first 15 minutes. Every existing tool says a crash happened. None of them tell you whether your release caused it, whether to roll back, or what to do next.

Classification, not just detection · Rollback / no-rollback recommendation · Honest about confidence

Mostly illustrative. Cards #1–#3 show the shape of what the Launch Vectors causality engine will surface for design-partner pilots. The engine is in the L3 build, and the numbers in those three cards are hand-crafted to demonstrate the format. Card #1 (Android System WebView, March 2021) is a real historical incident with a post-hoc classification mock-up; #2 and #3 are framed as recurring industry patterns rather than specific incidents.

Card #4 is live engine output. The Firefox Android card is generated by a small open-source pipeline (scripts/firefox-causality-demo) that queries Mozilla's public Crash Stats SuperSearch API and Bugzilla REST API. The signature, hit counts, version distribution, and Bugzilla cross-references in that card are real; re-run the script and the numbers update.
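For readers who want to see what such a pipeline asks for, here is a minimal sketch of the two public requests. It is not the code in scripts/firefox-causality-demo. The product name `Fenix`, the `android_version` facet, and the helper names are assumptions for illustration; the endpoints and the `_facets` / `_results_number` parameters follow the public SuperSearch and Bugzilla REST documentation.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

SUPERSEARCH = "https://crash-stats.mozilla.org/api/SuperSearch/"
BUGZILLA_BUG = "https://bugzilla.mozilla.org/rest/bug/{bug_id}"


def supersearch_url(signature, days=28, today=None, product="Fenix"):
    """Build a facet-only SuperSearch query for one crash signature."""
    today = today or date.today()
    start = today - timedelta(days=days)
    params = [
        ("product", product),             # assumed product name for Firefox Android
        ("signature", "=" + signature),   # "=" prefix asks for an exact match
        ("date", ">=" + start.isoformat()),
        ("_facets", "version"),
        ("_facets", "android_version"),   # assumed field name
        ("_results_number", "0"),         # aggregates only, no individual reports
    ]
    return SUPERSEARCH + "?" + urlencode(params)


def bugzilla_url(bug_id):
    """Build a Bugzilla REST lookup for a single bug."""
    query = urlencode({"include_fields": "id,status,summary"})
    return BUGZILLA_BUG.format(bug_id=bug_id) + "?" + query


def summarize(payload):
    """Pull the total hit count and per-version counts out of a response."""
    facets = payload.get("facets", {})
    return {
        "total": payload.get("total", 0),
        "versions": [(f["term"], f["count"]) for f in facets.get("version", [])],
    }
```

The sketch only builds URLs and parses a response dictionary, so it can be checked without network access; fetching is left to whatever HTTP client the caller prefers.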

Android System WebView crash spike

com.customer.app · Android 10 / 11 · Samsung & Pixel cohort

Critical · Platform regression · 2021-03-22

Public incident: Android System WebView / Chrome 89 broke embedded WebView across Gmail, Yahoo Mail, Amazon, Discord, Outlook, and banking apps.

What Crashlytics / Sentry shows

Fatal Exception: java.lang.RuntimeException
  at android.webkit.WebView.<init>(WebView.java:574)
  at com.android.webview.chromium.WebViewChromiumFactoryProvider…
  at com.customer.app.MainActivity.onCreate(MainActivity.kt:42)
Process: com.customer.app
Device: Samsung / Pixel
OS: Android 10 / 11
Affected sessions: +38× vs baseline

A crash happened. You go investigate.

Launch Vectors

Confidence: 94%

Classification

External dependency regression

Customer app release correlation: LOW
Recent app PR correlation: LOW
Common dependency: Android System WebView · Chrome 89.0.4389.x
Cohort: Apps embedding WebView, all releases
Likely cause: Platform-component regression, not your code
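The signals in this card reduce to a small decision rule: low release correlation plus one shared dependency version points away from your code. A minimal sketch, assuming hypothetical field names and thresholds; the engine's actual scoring is not described on this page.

```python
from dataclasses import dataclass


@dataclass
class CrashCluster:
    # Share of crashes in the cluster that come from the newest app release (0..1).
    newest_release_share: float
    # Share of active sessions running the newest app release (0..1).
    newest_release_adoption: float
    # Share of crashes that include the same dependency version (0..1).
    common_dependency_share: float


def classify(cluster, dep_threshold=0.9, skew_threshold=1.5):
    """Return a coarse label; both thresholds are illustrative, not tuned."""
    # If the newest release crashes no more often than its adoption predicts,
    # the release itself is an unlikely cause.
    adoption = max(cluster.newest_release_adoption, 1e-9)
    release_skew = cluster.newest_release_share / adoption
    release_correlated = release_skew >= skew_threshold
    if not release_correlated and cluster.common_dependency_share >= dep_threshold:
        return "external dependency regression"
    if release_correlated:
        return "app release regression"
    return "unclassified"
```

A cluster spread evenly across releases, with nearly every crash on the same WebView version, lands in the first branch, which is the "do not roll back" case.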

Recommendation

Do NOT roll back your release

  • Disable WebView-dependent feature via remote config
  • Show native fallback screen if WebView init fails
  • Guide users to update Android System WebView and Chrome
  • Add a startup guard around WebView creation

Takeaway: Hours of senior triage avoided. A rollback would have shipped the previous binary straight into the same WebView bug.

Time saved: ~6h senior triage

Crash rate spike traced to a third-party SDK

com.customer.shopping · Release 8.4.1 · iOS + Android

High · Third-party SDK regression · Recurring industry pattern

Pattern observed multiple times across mobile teams using major analytics / login SDKs.

What Crashlytics / Sentry shows

java.lang.NullPointerException
  at com.facebook.appevents.AppEventsLogger…
  at com.facebook.internal.Utility.coerceValueIfNullOrEmpty…
  at com.customer.shopping.checkout.CheckoutActivity.onCreate(CheckoutActivity.kt:118)
Process: com.customer.shopping
Crash rate: 0.2% → 3.1%
Sessions affected: +15×
Stack: CheckoutActivity / Facebook SDK frames

A crash happened. You go investigate.

Launch Vectors

Confidence: 94%

Classification

Third-party SDK regression

Crash-cluster co-occurrence: 100% contain Facebook SDK symbols
Release 8.4.1 correlation: LOW
Backend API changes: None in window
Feature flag changes: None in window
Common dependency: Facebook SDK 15.2
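Crash-cluster co-occurrence is the easiest of these signals to reproduce by hand: count how many stacks in the cluster contain at least one frame from the suspect package. A minimal sketch, where the helper name and the input shape (one list of frame strings per crash) are assumptions.

```python
def sdk_cooccurrence(stacks, package_prefix):
    """Fraction of crash stacks with at least one frame under package_prefix."""
    if not stacks:
        return 0.0
    hits = sum(
        1
        for frames in stacks
        if any(frame.startswith(package_prefix) for frame in frames)
    )
    return hits / len(stacks)


# Two crashes shaped like the stack in this card.
cluster = [
    [
        "com.facebook.appevents.AppEventsLogger",
        "com.customer.shopping.checkout.CheckoutActivity.onCreate",
    ],
    [
        "com.facebook.internal.Utility.coerceValueIfNullOrEmpty",
        "com.customer.shopping.checkout.CheckoutActivity.onCreate",
    ],
]
share = sdk_cooccurrence(cluster, "com.facebook.")
```

A share near 1.0 on its own is weak evidence, since an SDK that is called on every screen appears in most stacks anyway; it becomes meaningful next to a low release correlation and an empty change window.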

Recommendation

Pin SDK or disable the Facebook-login path — do not roll back the app

  • Pin Facebook SDK to 15.1 in your build (force-downgrade)
  • Disable Facebook-login pathway via remote flag for affected cohort
  • Open ticket with vendor; subscribe to their status page
  • Add SDK-version pin to release notes so future builds don't regress

Takeaway: Internal-code blame avoided. The app release is fine — the SDK shipped a bad update.

Time saved: ~4h senior triage

iOS crash from a backend response schema change

com.customer.commerce · iOS 17 · last mobile release 4 days ago

Medium · Backend contract change · Recurring industry pattern

Tested with synthetic backend deploy events. Detected from the temporal correlation between mobile crash cluster onset and backend deploy timestamp.

What Crashlytics / Sentry shows

Fatal error: Unexpectedly found nil while unwrapping an Optional value
  ProductDetailViewModel.swift:84
  ProductDetailViewController.viewWillAppear(_:)
  UIKit symbol frames…
Process: com.customer.commerce
Endpoint touched at crash time: GET /product/details
Cluster onset: 2:13 PM
Affected platforms: iOS only

A crash happened. You go investigate.

Launch Vectors

Confidence: 92%

Classification

Backend contract change

New crash cluster appeared: 2:13 PM
Last mobile release: 4 days ago (not in window)
Last backend deploy: 17 minutes ago
Affected endpoint: GET /product/details
Detected response delta: price field now nullable
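The temporal part of this classification is a window check: which change events landed shortly before the cluster appeared? A minimal sketch, assuming a one-hour window, an arbitrary calendar date, and that the 17 minutes are measured back from cluster onset; none of these are stated engine parameters.

```python
from datetime import datetime, timedelta


def changes_in_window(onset, events, window=timedelta(hours=1)):
    """Return (kind, lag) for changes that landed within `window` before onset.

    Results are ordered by lag, so the most recent change comes first.
    """
    candidates = [
        (kind, onset - at)
        for kind, at in events
        if timedelta(0) <= onset - at <= window
    ]
    return sorted(candidates, key=lambda item: item[1])


onset = datetime(2024, 5, 6, 14, 13)  # 2:13 PM; the date is arbitrary
events = [
    ("mobile release", onset - timedelta(days=4)),
    ("backend deploy", onset - timedelta(minutes=17)),
]
suspects = changes_in_window(onset, events)
```

Here `suspects` holds only the backend deploy, because the mobile release falls outside the window. The window width is the main design choice: too narrow misses slow rollouts, too wide pulls in unrelated changes.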

Recommendation

Contact backend team — no mobile rollback required

  • Page the backend on-call: schema regression on /product/details
  • Hot-fix backend to keep `price` non-nullable (or send 0 fallback)
  • Add nil-guard on iOS for the next release
  • Add a contract test to your shared API spec so this can't reach prod again
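The contract test in the last step can be very small. A sketch of the idea, where the `price` field comes from this card and the `id` field, the helper name, and the sample payloads are hypothetical; a real suite would more likely validate against the shared API spec than against a hand-written field list.

```python
def null_violations(payload, non_nullable):
    """List the fields that are missing or null in a response body."""
    return [name for name in non_nullable if payload.get(name) is None]


NON_NULLABLE = ["id", "price"]  # hypothetical contract for /product/details

good_response = {"id": 42, "price": 19.99}
bad_response = {"id": 42, "price": None}


def test_product_details_contract():
    # Run against a recorded or staging response before each backend deploy.
    assert null_violations(good_response, NON_NULLABLE) == []
```

Run against the changed response, the same check reports `price`, which is the signal that should block the deploy.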

Takeaway: The mobile team didn't ship the bug. Without correlation, this turns into a multi-hour blame loop.

Time saved: ~3h senior triage

Firefox Android — live triage from public telemetry

org.mozilla.firefox · Firefox Android (Fenix) · top affected: 153.0a1 · Android 30

Live engine output · Medium · Open-telemetry triage · Generated 2026-06-01 · public data

Live engine output. Crash signature, hit-count trend, and version + OS distribution are all pulled from Mozilla's public Crash Stats SuperSearch API for the last 28 days. Component attribution and per-component recommendation actions come from a hand-curated rule set in scripts/firefox-causality-demo/run.py; see the README for what's heuristic and what's real.

What Crashlytics / Sentry shows

// Crash Stats signature view (public)
Signature: libc.so | core::ptr::drop_in_place<T> | core::ptr::drop_in_place<T> | <glean_sym::types::TimerId as uniffi_core::ffi_…
// Hit count (28d): 45,352
// Trend: +0.0% vs prior 14-day half
Build: 153.0a1
Hit count (28d): 45,352
Trend: +0.0% vs prior 14-day half
Top OS: Android 30
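The trend figure compares the two 14-day halves of the 28-day window. A minimal sketch of that arithmetic, assuming the input is one hit count per day in chronological order; the function name is hypothetical and the real script may compute the figure differently.

```python
def half_over_half_trend(daily_counts):
    """Percent change of the later half of a window versus the earlier half."""
    if len(daily_counts) % 2:
        raise ValueError("window must have an even number of days")
    mid = len(daily_counts) // 2
    earlier, later = sum(daily_counts[:mid]), sum(daily_counts[mid:])
    if earlier == 0:
        return None  # no baseline to compare against
    return 100.0 * (later - earlier) / earlier


flat = [10] * 28
label = f"{half_over_half_trend(flat):+.1f}% vs prior 14-day half"
```

A flat series yields the `+0.0%` label shown in this card. A flat trend on a high hit count suggests a steady, long-running crash rather than a fresh spike.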

A crash happened. You go investigate.

Launch Vectors

Confidence: 70%

Classification

Glean SDK (uniffi binding layer) regression (rule-based attribution)

Component: Glean SDK (uniffi binding layer)
Failure mode: Memory-unsafety pattern in the UniFFI binding `try_lift` path, typically a use-after-free or invalid pointer during cross-language type conversion.
Affected versions: 153.0a1 (45,352)
Affected OS: Android 30 (12,077), Android 36 (7,862), Android 31 (6,401)
Related Bugzilla bugs: bug 2043773 (ASSIGNED), bug 1675653 (RESOLVED), bug 1747382 (RESOLVED)
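Rule-based attribution of this kind can be as simple as matching substrings of the crash signature against an ordered rule list. The sketch below illustrates the approach and is not the rule set in run.py; the two rules and the function name are hypothetical.

```python
# Hypothetical rules: (substrings that must all appear, component label).
# Order matters, so more specific rules come first.
RULES = [
    (("glean", "uniffi"), "Glean SDK (uniffi binding layer)"),
    (("glean",), "Glean SDK"),
]


def attribute(signature):
    """Return the first component whose substrings all occur in the signature."""
    lowered = signature.lower()
    for needles, component in RULES:
        if all(needle in lowered for needle in needles):
            return component
    return None


# The signature prefix shown in this card (truncated in the source).
signature = (
    "libc.so | core::ptr::drop_in_place<T> | core::ptr::drop_in_place<T> | "
    "<glean_sym::types::TimerId as uniffi_core::ffi_"
)
component = attribute(signature)
```

Substring rules are cheap and transparent, which fits a confidence around 70%: they name a plausible component but cannot distinguish a bug in the binding layer from a caller misusing it.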

Recommendation

Pin glean-core to a known-good version, suppress the failing metric, and filter the signature from alert routing.

  • Pin `glean-core` (Rust) to the latest patch release of the previous minor. In Fenix this is consumed via `application-services`; bump the override in `gradle/libs.versions.toml` under `glean` → next-newest stable.
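  • Wrap the calling metric (`TimingDistributionMetric.start()` based on the signature) in a try/catch at the recording site so a single bad sample doesn't propagate. This only helps when the failure surfaces as a JVM exception; a try/catch cannot intercept a native crash inside `libc.so`. Reference: `mozilla.telemetry.glean.private.TimingDistributionMetricType`.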
  • Add the signature to your Crashlytics/Sentry inbound-alert filter as a known-upstream issue while the pin is in place — prevents on-call paging on a non-actionable third-party crash.
  • Subscribe to the Bugzilla bugs linked below; cherry-pick the fix patch into your `glean-core` override the day it lands in `mozilla-central`.
  • Open a tracking issue in your own repo against the next release, blocking rollout until `glean-core` is upgraded past the pinned version.

Takeaway: Real production evidence + rule-based attribution + per-component actionable recommendations. The next pipeline upgrade replaces the rule set with static-analysis output on the Glean / Gecko / Necko source code and LLM-generated patches verified against the binary symbol table — taking confidence from ~70% toward >90% and the recommendations from "pin and suppress" toward "here's the diff".

Time saved: ~5h senior triage

On the roadmap

The harder incidents we're building toward.

These need infrastructure we don't ship in v1 — heap profilers, cohort detectors, sentiment ingest from App Store reviews. Design partners get them as we build them.

Memory-leak attribution

Closure-level retain-cycle traceback that points to the offending PR and the `weak self` fix.

Cohort-specific incidents

"Samsung S23 / Android 15 / CameraX" — surfaced even when the global crash rate looks healthy.

Incident-commander view

Synthesizes crashes + ANRs + App Store reviews + support tickets into one root-cause summary.

Get this for your mobile team.

Design partners get the causality engine deployed on their real telemetry — crashes, releases, PRs, backend deploys, feature flags, all correlated.