[09:11:10] Analytics, Wikidata, User-Ladsgroup, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞): Run hadoop analysis on wb_terms migration for entities below 29 million to check state - https://phabricator.wikimedia.org/T243763 (Addshore) Open→Resolved
[09:28:16] Analytics, Analytics-Cluster, Operations, ops-eqiad: Degraded RAID on analytics1030 - https://phabricator.wikimedia.org/T243971 (Peachey88)
[14:50:39] Analytics, Analytics-Cluster, Operations: notebook1004 - /srv is full - https://phabricator.wikimedia.org/T232068 (jijiki) Resolved→Open
[14:51:55] Analytics, Analytics-Cluster, Operations: notebook1004 - /srv is full - https://phabricator.wikimedia.org/T232068 (jijiki) Host alerted again about /srv being full, /srv/home is 119G.
[15:32:29] Hi, sorry to bug you, wasn't sure if this was a monitored channel. But I did want to ask: are the non-summarized clickstream data available after 2018
[15:33:27] I have a project that could make use of the external referrer counts for December 2019 (and potentially ongoing in 2020 if the data were available from which you build the nice categorical summaries).
[15:37:58] a-team?
[15:48:00] jhcowie: I’ll take a look
[15:50:04] jhcowie: so do you mean this is summarized: https://dumps.wikimedia.org/other/clickstream/ and you need non-summarized? Do you have an example of non-summarized then?
[16:10:04] Analytics, Analytics-Cluster, Operations: notebook1004 - /srv is full - https://phabricator.wikimedia.org/T232068 (elukey) @Groceryheist hello, can you check your home directory size ? :)
[16:30:44] Analytics, WMDE-Analytics-Engineering, Privacy, User-GoranSMilovanovic: Public data set review for T237728 - https://phabricator.wikimedia.org/T239393 (Nuria) >(which are - correct me if I am wrong - considered as private data under the WMF policy) I think you are mistaken editors and readers da...
[17:00:32] milimetric: exactly right. Let me see if I can find an example of the nonsummarized (e.g., with real external URLs referring to the various pages)
[17:01:03] Analytics, Wikidata, Wikidata-Campsite: Long query running on dbstore1005:3318 - https://phabricator.wikimedia.org/T243871 (Marostegui) After the query was killed no more replication lag has happened {F31542138}
[17:13:01] milimetric: my apologies, I cannot find an example of unsummarized (thought I had one; looks like it's just early versions of the summarized product you currently publish).
[17:14:22] There is a media data API that sounds like it allows retrieval of the referer URLs for a wikimedia media object in a specified time range, but not one for the pages themselves.
[17:17:31] If wishes were free, I'd like to find an API that lets me submit a page (e.g., "Bicycle") and a date (e.g., "2019-12-15") and get the list of referer URLs, with counts, that you saw in the request logs on that date.
[17:18:46] I think this must exist (if only transiently) before being boiled down to a single equivalence class ("external") for the summarized product.
[17:23:33] jhcowie: ah, I see, yeah, we delete the raw referers after 60 days and don’t make them available publicly in any way. The raw urls are too sensitive for public release
[17:24:23] ok, I can understand that call. Thanks.
[18:16:14] PROBLEM - Webrequests Varnishkafka log producer on cp4029 is CRITICAL: PROCS CRITICAL: 0 processes with args /usr/bin/varnishkafka -S /etc/varnishkafka/webrequest.conf https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka
[18:16:52] PROBLEM - statsv Varnishkafka log producer on cp4029 is CRITICAL: PROCS CRITICAL: 0 processes with args /usr/bin/varnishkafka -S /etc/varnishkafka/statsv.conf https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka
[18:17:36] PROBLEM - eventlogging Varnishkafka log producer on cp4029 is CRITICAL: PROCS CRITICAL: 0 processes with args /usr/bin/varnishkafka -S /etc/varnishkafka/eventlogging.conf https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka
[18:18:04] RECOVERY - Webrequests Varnishkafka log producer on cp4029 is OK: PROCS OK: 1 process with args /usr/bin/varnishkafka -S /etc/varnishkafka/webrequest.conf https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka
[18:18:42] RECOVERY - statsv Varnishkafka log producer on cp4029 is OK: PROCS OK: 1 process with args /usr/bin/varnishkafka -S /etc/varnishkafka/statsv.conf https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka
[18:19:28] RECOVERY - eventlogging Varnishkafka log producer on cp4029 is OK: PROCS OK: 1 process with args /usr/bin/varnishkafka -S /etc/varnishkafka/eventlogging.conf https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka