[00:06:55] Analytics-Tech-community-metrics, Developer-Relations, DevRel-October-2015: Check whether it is true that we have lost 40% of (Git) code contributors in the past 12 months - https://phabricator.wikimedia.org/T103292#1772404 (Aklapper) >>! In T103292#1762313 on Oct 28, @Aklapper wrote: > merged into htt... [00:07:14] Analytics-Tech-community-metrics, Developer-Relations, DevRel-November-2015, DevRel-October-2015: Check whether it is true that we have lost 40% of (Git) code contributors in the past 12 months - https://phabricator.wikimedia.org/T103292#1772406 (Aklapper) [00:09:17] Analytics-Tech-community-metrics, DevRel-November-2015: Automated generation of (Git) repositories for Korma - https://phabricator.wikimedia.org/T110678#1772407 (Aklapper) @Dicortazar: Any feedback to T110678#1745829 ? [00:09:52] Analytics-Tech-community-metrics, DevRel-November-2015: Backlogs of open changesets by affiliation - https://phabricator.wikimedia.org/T113719#1772410 (Aklapper) [00:10:33] Analytics-Tech-community-metrics, DevRel-November-2015: "Age of open changesets by Affiliation" has some "NaN" values - https://phabricator.wikimedia.org/T110875#1772413 (Aklapper) [00:10:45] Analytics-Tech-community-metrics, DevRel-November-2015: Affiliations and country of resident should be visible in Korma's user profiles - https://phabricator.wikimedia.org/T112528#1772414 (Aklapper) [00:12:52] Analytics-Tech-community-metrics, DevRel-November-2015: "Tickets" (defunct Bugzilla) vs "Maniphest" sections on korma are confusing - https://phabricator.wikimedia.org/T106037#1772420 (Aklapper) [00:13:23] Analytics-Tech-community-metrics, DevRel-November-2015: Correct affiliation for code review contributors of the past 30 days - https://phabricator.wikimedia.org/T112527#1772423 (Aklapper) [00:41:34] Analytics-Tech-community-metrics, DevRel-November-2015: Legend for "review time for reviewers" and other strings on repository.html - https://phabricator.wikimedia.org/T103469#1772440 
(Aklapper) a:Dicortazar @Dicortazar: Need feedback on the last comment [07:10:51] Analytics-Wikistats: Cross-link stats.wikimedia.org and ee-dashboard.wmflabs.org - https://phabricator.wikimedia.org/T67994#1772675 (Liuxinyu970226) [07:15:38] Analytics: Update reportcard.wmflabs.org with September data - https://phabricator.wikimedia.org/T116244#1772678 (Liuxinyu970226) [07:27:08] Analytics-Tech-community-metrics, DevRel-November-2015: Tech community KPIs for the WMF metrics meeting - https://phabricator.wikimedia.org/T107562#1772691 (Qgil) [09:08:26] (PS3) Addshore: adds bulk sparql query and output scripts [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/248033 (owner: Christopher Johnson (WMDE)) [09:11:41] (CR) Addshore: [C: -1] "Also you may have to use a webproxy from the cluster." (2 comments) [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/248033 (owner: Christopher Johnson (WMDE)) [09:12:30] Ironholds: what are your normal working times then? :) And what time zone again? ;) [09:38:20] Analytics, Analytics-Kanban, Patch-For-Review: View counts in squid logs, webstatscollector 2.0 and hive are very dissimilar for several projects. [5 pts] - https://phabricator.wikimedia.org/T116609#1772830 (Nemo_bis) The current `$wgNoticeHideUrls` also explains why Wiktionary and Wikivoyage pageviews... [10:14:19] Analytics, Analytics-Kanban, Patch-For-Review: View counts in squid logs, webstatscollector 2.0 and hive are very dissimilar for several projects. [5 pts] - https://phabricator.wikimedia.org/T116609#1772888 (Ironholds) The new pageview definition excludes HideBanners based on the MIME type (we should p... [10:25:53] (PS1) OliverKeyes: Expand the prohibited uri_paths in the Pageview definition. 
[analytics/refinery/source] - https://gerrit.wikimedia.org/r/250389 (https://phabricator.wikimedia.org/T117345) [10:26:01] Analytics-Backlog, Patch-For-Review: Exclude MobileMenu from Pageviews - https://phabricator.wikimedia.org/T117345#1772926 (Ironholds) a:Ironholds [10:28:23] addshore, EST, midday to 7pm or so [10:28:32] as to why I'm telling you this now: I have insomnia ;p [10:28:52] HAH [10:29:07] in my defence I just submitted a one-line patch that fixes two different bugs [10:29:12] so at least I'm using my time productively [10:29:59] yay, 5 am ;) [10:36:19] Thanks Ironholds for the comment and patch on PageviewDef: ) [10:36:30] And now GET BACK TO SLEEP :) [10:57:51] joal, yessir! [11:28:24] Analytics-Cluster, Database: Replicate Echo tables to analytics-store - https://phabricator.wikimedia.org/T115275#1773035 (jcrespo) So here it is the thing: Replicating just flowdb takes ~2 QPS and very few MBs. This has worked well for T75047. Replicating the echo tables requires >120 GB and double the... [11:47:57] Analytics-Wikistats: Cross-link stats.wikimedia.org and ee-dashboard.wmflabs.org - https://phabricator.wikimedia.org/T67994#1773128 (ezachte) What's the best way to detect which language codes have new stats? 
(other than screen scraping https://meta.wikimedia.org/wiki/Research:VisualEditor) [11:52:08] Analytics, Continuous-Integration-Config, WMDE-Analytics-Engineering, Wikidata: Add basic jenkins linting to analytics-limn-wikidata-data - https://phabricator.wikimedia.org/T116007#1773147 (Addshore) [12:14:42] Analytics, Continuous-Integration-Config, WMDE-Analytics-Engineering, Wikidata, Patch-For-Review: Add basic jenkins linting to analytics-limn-wikidata-data - https://phabricator.wikimedia.org/T116007#1773176 (Addshore) a:JanZerebecki [12:45:05] * addshore goes to look at dumping some rdf into hadoop [13:13:53] Analytics, Continuous-Integration-Config, WMDE-Analytics-Engineering, Wikidata, and 2 others: Add basic jenkins linting to analytics-limn-wikidata-data - https://phabricator.wikimedia.org/T116007#1773275 (JanZerebecki) [13:54:26] (CR) JanZerebecki: "recheck" [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/245468 (owner: Addshore) [13:54:51] (CR) Addshore: ":)" [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/245468 (owner: Addshore) [13:55:13] Analytics, Continuous-Integration-Config, WMDE-Analytics-Engineering, Wikidata, and 2 others: Add basic jenkins linting to analytics-limn-wikidata-data - https://phabricator.wikimedia.org/T116007#1773442 (Addshore) Open>Resolved [13:59:18] Analytics, Continuous-Integration-Config: add CI for repos analytics/limn-*-data - https://phabricator.wikimedia.org/T117416#1773454 (JanZerebecki) NEW [14:02:13] Analytics, Continuous-Integration-Config, WMDE-Analytics-Engineering, Wikidata, and 2 others: Add basic jenkins linting to analytics-limn-wikidata-data - https://phabricator.wikimedia.org/T116007#1773474 (JanZerebecki) >>! In T116007#1757055, @hashar wrote: > Don't we want jobs for all the limn da... 
[14:30:46] Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Feed Wikistats traffic reports with aggregated hive data {lama} [21 pts] - https://phabricator.wikimedia.org/T114379#1773518 (Milimetric) @ezachte, before we backfill pageviews-* data back to May, I just want to double check, if that's useful to... [14:31:36] Analytics-Wikistats: wrong total number of articles for Tyv.wikipedia.org - https://phabricator.wikimedia.org/T56681#1773528 (Aklapper) p:Triage>Lowest [14:32:32] Analytics-Wikistats: Vertically Reversed Quarter Data - https://phabricator.wikimedia.org/T57621#1773533 (Aklapper) p:Triage>Lowest [14:32:40] Analytics-Visualization: GeoIP updates can users to jump to new country in geowiki files - https://phabricator.wikimedia.org/T56650#1773542 (Aklapper) p:Triage>Lowest [14:32:44] Analytics-Visualization: Clarify in graphs that users might be in both global north, and global south - https://phabricator.wikimedia.org/T56649#1773545 (Aklapper) p:Triage>Lowest [14:32:46] Analytics-Visualization: Column changes in global-dev/dashboard-data files breaks user-generated graphs - https://phabricator.wikimedia.org/T56612#1773546 (Aklapper) p:Triage>Lowest [14:32:48] Analytics-Visualization: Column changes in geowiki-data files breaks user-generated graphs - https://phabricator.wikimedia.org/T56611#1773549 (Aklapper) p:Triage>Lowest [14:45:16] ottomata: I was looking at playing around with importing the rdf or json dumps of wikidata into hive / hadoop and just came across http://ottomata.org/tech/too-many-hive-json-serdes/ :P any tips? ;P [14:46:19] addshore: ha :) [14:46:31] those tips mostly are still valid, although the hcatalog auxpath has probably changed [14:46:39] look into the create table statement of wmf_raw.webrequest [14:46:55] that is a snappy compressed sequence file of json data [14:47:05] addshore: where is this data currently stored? [14:47:12] and, how big is it? 
[14:47:37] I played around and firstly tried LOAD DATA and having a table with 1 col (which is the json object) then using json_tuple in queries, but of course thats slow as there is no way to partition, and well, its just a bunch of strings really... [14:48:04] Not actually played with SerDes yet, as that seems like a bit of work :P [14:48:17] Well, LOAD DATA LOCAL INPATH '/mnt/data/xmldatadumps/public/wikidatawiki/entities/20151102/wikidata-20151102-all.json.gz' OVERWRITE INTO TABLE test1; [14:48:27] I was trying with that json dump initially [14:48:52] although my initial thoughts were to use the RDF dumps at http://tools.wmflabs.org/wikidata-exports/rdf/index.php?content=exports.php [14:51:48] ottomata: where is the create for wmf_raw.webrequest? [14:51:59] show create table wmf_raw.webrequest; [14:52:00] or [14:52:11] https://github.com/wikimedia/analytics-refinery/blob/master/hive/webrequest/create_webrequest_raw_table.hql [14:52:14] FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: MetaException(message:java.lang.ClassNotFoundException Class org.apache.hive.hcatalog.data.JsonSerDe not found) [14:52:25] aye [14:53:13] https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hive/Queries#JsonSerDe_Errors [14:53:17] ADD JAR /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar; [14:54:13] ahh! [14:54:51] I wonder if the rdf would be easier, it would definitely mean it could be partitioned easier. [14:55:52] how about an easy way to simply import an RDF dump into hive? a table with 3 columns? ;) [14:57:08] (CR) Nuria: Expand the prohibited uri_paths in the Pageview definition. (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/250389 (https://phabricator.wikimedia.org/T117345) (owner: OliverKeyes) [14:57:34] (PS7) Nuria: Functions for identifying search engines as referers. 
[analytics/refinery/source] - https://gerrit.wikimedia.org/r/247601 (https://phabricator.wikimedia.org/T115919) (owner: OliverKeyes) [14:59:49] Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Feed Wikistats traffic reports with aggregated hive data {lama} [21 pts] - https://phabricator.wikimedia.org/T114379#1773661 (Milimetric) @ezachte: I think I got to the bottom of the 4.8% difference between WC 3.0 and wmf.pageview_hourly. Basica... [14:59:52] Analytics-Backlog, Analytics-EventLogging, Analytics-Kanban: More solid Eventlogging alarms for raw/validated - https://phabricator.wikimedia.org/T116035#1773662 (Nuria) p:Triage>High [15:00:08] addshore: i have never used RDFs, soooo can't guide you there [15:00:09] but, probably! [15:00:19] :D [15:00:30] cool, I'll just keep poking ;) [15:02:57] Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Feed Wikistats traffic reports with aggregated hive data {lama} [21 pts] - https://phabricator.wikimedia.org/T114379#1773690 (ezachte) @Milimetric, projectviews are indeed all I need for this process (someday when I upgrade daily&monthly aggrega... [15:05:12] ottomata: you mind helping Erik with access rsync-ing to dataset1001? https://phabricator.wikimedia.org/T114379#1773690 [15:07:24] milimetric: yes will, trying to fix broken server atm... [15:09:18] ok, no rush [15:12:01] Analytics-Kanban: Pageview API showcase App {slug} - https://phabricator.wikimedia.org/T117224#1773737 (Milimetric) a:mforns [15:12:32] Analytics-Kanban: Reformat pageview API responses to allow for status reports and messages {slug} - https://phabricator.wikimedia.org/T117017#1773742 (Milimetric) a:Milimetric [15:42:49] (CR) OliverKeyes: Expand the prohibited uri_paths in the Pageview definition. (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/250389 (https://phabricator.wikimedia.org/T117345) (owner: OliverKeyes) [15:45:59] holaaa [15:52:59] (CR) Nuria: [C: 2 V: 2] "Merging. 
I leave documentation of changes up to you, right?" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/250389 (https://phabricator.wikimedia.org/T117345) (owner: OliverKeyes) [15:59:18] (CR) OliverKeyes: "Sounds good; I'll throw a set up on the wiki entry." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/250389 (https://phabricator.wikimedia.org/T117345) (owner: OliverKeyes) [16:02:27] (PS21) Joal: Add cassandra load job for pageview API [analytics/refinery] - https://gerrit.wikimedia.org/r/236224 [16:13:24] milimetric: Hi ! [16:17:32] hiya llll, milimetric or nuria, need brain bounce about EL stuff [16:17:49] ottomata: sure, we can catch up with services conversation too [16:17:52] js [16:17:53] ja [16:17:54] ottomata: the internet here's terrible, you can call my phone from the hangout [16:17:54] ok batcave! [16:18:08] ok milimetric [16:18:23] nuria: same for standup, if you don't mind, just dial me in [16:19:23] milimetric: called you but you no answer! [16:19:41] argh! sorry, call me back, that was my bad ottomata [16:24:12] milimetric: issue on restbase... I should have seen this one coming :( [16:24:12] https://github.com/wikimedia/restevent/blob/master/lib/queue.js [16:35:29] Analytics, Fundraising-Backlog, Performance-Team: Spike: CentralNotice: Investigate new ways to implement cross-domain banner close-button functionality, to end the Special:HideBanners request-heavy cookie storm - https://phabricator.wikimedia.org/T117433#1774073 (AndyRussG) NEW [16:35:58] Analytics, Fundraising-Backlog, Performance-Team: Spike: CentralNotice: Investigate new ways to implement cross-domain banner close-button functionality, to end the Special:HideBanners cookie storm - https://phabricator.wikimedia.org/T117433#1774082 (AndyRussG) [16:37:02] is there a nice way to split a table up if there is nothing really to partition on in hive? 
:/ [16:38:30] Analytics, Fundraising-Backlog, Performance-Team: Spike: CentralNotice: Investigate new ways to implement cross-domain banner close-button functionality, to end the Special:HideBanners cookie storm - https://phabricator.wikimedia.org/T117433#1774095 (AndyRussG) Note: added the Analytics project since... [16:39:27] Analytics, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Performance-Team: Spike: CentralNotice: Investigate new ways to implement cross-domain banner close-button functionality, to end the Special:HideBanners cookie storm - https://phabricator.wikimedia.org/T117433#1774098 (AndyRussG) [16:40:41] Analytics, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Performance-Team: Spike: CentralNotice: Investigate new ways to implement cross-domain banner close-button functionality, to end the Special:HideBanners cookie storm - https://phabricator.wikimedia.org/T117433#1774073 (AndyRussG) [16:59:58] Analytics, Analytics-Kanban, Patch-For-Review: View counts in squid logs, webstatscollector 2.0 and hive are very dissimilar for several projects. [5 pts] - https://phabricator.wikimedia.org/T116609#1774229 (AndyRussG) For the curious: I've made two tasks about limiting requests or improving performanc... [17:00:55] Analytics, Analytics-Kanban, Discovery, EventBus, and 8 others: EventBus MVP - https://phabricator.wikimedia.org/T114443#1774236 (Ottomata) [17:01:29] Analytics-Kanban: Druid testing on labs to asses whether is a suitable Cassandra replacement. {slug} [8 pts] - https://phabricator.wikimedia.org/T116409#1774243 (Nuria) Open>Resolved [17:01:30] Analytics-Backlog: Data Store hardware procurement - 2015 - https://phabricator.wikimedia.org/T117008#1774244 (Nuria) [17:01:41] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Browser Report. 
Small bugfixes {lama} [5 pts] - https://phabricator.wikimedia.org/T116931#1774247 (Nuria) Open>Resolved [17:02:29] Analytics-Kanban: Improve record size on cassandra storage for pageview API data (RESTBase changes) {slug} [8 pts] - https://phabricator.wikimedia.org/T116209#1774259 (Nuria) Open>Resolved [17:02:48] Analytics-Kanban: Document Cassandra SLAS and storage requirements for daily and hourly data {slug} [5 pts] - https://phabricator.wikimedia.org/T116407#1774263 (Nuria) Open>Resolved [17:03:05] Analytics-Kanban, Patch-For-Review: Add lag option to reportupdater {frog} [8 pts] - https://phabricator.wikimedia.org/T117091#1774265 (Nuria) Open>Resolved [17:05:25] Analytics, Analytics-Kanban, Patch-For-Review: View counts in squid logs, webstatscollector 2.0 and hive are very dissimilar for several projects. [5 pts] - https://phabricator.wikimedia.org/T116609#1774271 (Nuria) a:Nuria [17:05:50] joal: not sure if you saw, but I restarted cassandra on aqs1002 this weekend [17:05:58] it had run out of heap [17:06:17] hey gwicke : I had not seen the restart but the errors [17:06:21] any idea about heap issue ? [17:06:41] I didn't look too closely [17:07:12] are there automated bulk writes? [17:07:30] if so, then that would be my suspicion, along with compaction perhaps being pushed a bit [17:10:06] gwicke: yeah, heavy monthly load this weekend [17:10:30] I have changed compaction today, exepected behavior on 1001, but on 1002 and 3, not yet caught vback up [17:11:07] hangouts froze for both of us [17:11:09] gwicke: We (Dan and me) also have a question about schema change, will ask after stanup :) [17:11:28] kk [17:14:41] Analytics, Analytics-Kanban, Patch-For-Review: View counts in squid logs, webstatscollector 2.0 and hive are very dissimilar for several projects. [5 pts] - https://phabricator.wikimedia.org/T116609#1774307 (ezachte) @AndyRussG thanks for chiming in. Now I understand what this is about. Does it make s... 
[17:17:47] milimetric: fyi, ee limn thing merged. [17:26:15] Analytics, Analytics-Kanban, Patch-For-Review: View counts in squid logs, webstatscollector 2.0 and hive are very dissimilar for several projects. [5 pts] - https://phabricator.wikimedia.org/T116609#1774367 (AndyRussG) >>! In T116609#1774307, @ezachte wrote: > Does it make sense to you that we have thi... [17:30:14] ended up getting a bit distracted over the last couple weeks and didn't get to turn it on, but i plan to turn on code that starts sending cirrus search logs to kafka this afternoon [17:30:18] (just FYI) [17:30:52] thx ottomata [17:35:53] ebernhardson: cool, let us know when its there and we'll check up on camus imports [17:35:59] i think everything is ready [17:37:13] joal: you gonna ask about the table versions in -services or you having a private thing? [17:38:00] joal: I'm going to change all the tables, including the 16-column articleFlat one? We should if we plan on putting monthly per-article data in there, but shouldn't otherwise 'cause it'll just use more space [17:38:08] milimetric: doing now, was having a minute off [17:38:27] sorry :) [17:38:30] didn't mean to bug [17:38:35] milimetric: there is a job for monthly per-article [17:38:41] oh! [17:38:49] i'll have to add that granularity then :) [17:38:54] I don't know if we can afford it though :) [17:39:08] So, many we just keep daily ? [17:39:16] hm .... [17:39:24] oh... :) [17:39:30] uh, kill it if it's taking resources man [17:39:37] ok, so let's never have it then [17:39:47] Analytics, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Performance-Team: Spike: CentralNotice: Investigate new ways to implement cross-domain banner close-button functionality, to end the Special:HideBanners cookie storm - https://phabricator.wikimedia.org/T117433#1774431 (ori) > When... 
[17:40:09] or maybe we can have it and just do it client-side by summing in JS [17:40:35] milimetric: why not [17:40:48] milimetric: currently double checking if I have a guestimate of data size [17:41:05] data size for what? [17:41:37] for monthly per aerticle [17:42:05] Analytics-Kanban: Analytics support for echo dashboard task {frog} [8 pts] - https://phabricator.wikimedia.org/T117220#1768995 (Milimetric) @matthiasmullie: the puppet change for this was deployed, so the report should run sometime soon. [17:42:38] joal: no don't worry about that, we don't need to use resources to satisfy that, I didn't even know we were trying to compute it, so I think it was just a miscommunication [17:42:38] milimetric: monthly per-article: 4.5G gziped - ~60G in cassandra [17:42:53] milimetric: Ok got it [17:43:06] milimetric: I'll keep monthly for per-project and op [17:43:07] yeah, not worth it, let's wait to see how many people ask for it and we can gauge if we want to do it client side [17:43:10] top [17:43:16] kk [17:43:28] ok, i'll add that granularity to per-project [17:43:28] gwicke: quick question about schema change [17:43:49] gwicke: can we convert from int to long ? [17:44:14] Analytics, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Performance-Team: Spike: CentralNotice: Investigate new ways to implement cross-domain banner close-button functionality, to end the Special:HideBanners cookie storm - https://phabricator.wikimedia.org/T117433#1774464 (Ejegg) Look... [17:45:13] joal: not in-place [17:45:18] afaik [17:46:09] it's cheap to add columns, though [17:46:46] so you could start writing new data as a long, and return longcolumn || intcolumn [17:46:51] gwicke: even if we change the version: of the table? [17:47:46] in restbase, we enforce backwards-compatible schema changes [17:47:58] hm .... Can we hack the thing ? [17:48:17] meaning, convert in cqlsh and deploy restbase with new version? 
[17:48:21] gwicke: --^ [17:48:28] there is no conversion support in cassandra [17:48:35] oh ? [17:48:38] you can write a script that reads one column & writes to another [17:48:47] but, that's not instantaneous [17:48:50] man .... Wouldn't have expectec that [17:49:05] it's not hard to add a line to return one field or the other [17:49:08] right, so we'd have to like move all the data to a new column, delete old one, recreate old name new type, move again, delete new [17:49:23] yeah, but it would constrain us even more on space [17:49:27] which we're running out of rapidly [17:49:27] milimetric: reasonably easy, small amount of data (projecxt only) [17:50:07] ok joal let's hack it then and we can keep the per-article counts the way they are for now, and change it if we hit problems [17:50:08] scary though [17:50:11] remember that columns only use space when they are non-null [17:50:26] so writing new columns only with the long doesn't use more space than those with the int [17:50:35] even if both columns are defined [17:53:57] milimetric: shall we go for dual columns, and then hack (copy) [17:53:58] ? [17:56:28] Analytics, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Performance-Team: Spike: CentralNotice: Investigate new ways to implement cross-domain banner close-button functionality, to end the Special:HideBanners cookie storm - https://phabricator.wikimedia.org/T117433#1774512 (Nuria) >How... [17:57:05] joal: I'd recommend to add a column & return either in the read path [17:57:23] Makes sense gwicke [17:57:30] milimetric: whatcha think ? 
[17:57:43] then run a script to port the old data, and finally drop the old column once that's done [17:58:23] Analytics-Kanban, Patch-For-Review: Exclude MobileMenu from Pageviews - https://phabricator.wikimedia.org/T117345#1774518 (Nuria) [18:02:32] joal: that works for me for the per-project table [18:02:38] but I'm not loving the idea for per-article [18:02:46] that'd be 32 columns, code would get pretty nasty [18:02:47] let's not change per-article [18:03:18] ok, then i'll change it to return from either column. gwicke, what's the best way to synchronize a schema change with a deploy? [18:03:24] will the deploy just do it for us? [18:04:01] milimetric: yeah, it'll figure out what to do [18:04:11] Analytics, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Performance-Team: Spike: CentralNotice: Investigate new ways to implement cross-domain banner close-button functionality, to end the Special:HideBanners cookie storm - https://phabricator.wikimedia.org/T117433#1774543 (Ejegg) We'r... [18:04:15] k, i'll have this PR up soon then [18:04:15] just remember to increment the version number [18:04:35] and don't drop the int column from the schema right away [18:05:49] yep, no worries, will do [18:06:08] joal: I'm gonna call the new column just "v" [18:06:09] milimetric: just as a check, highest values for daily are far from maxint, we are safwe :) [18:06:16] ok milimetric [18:06:21] upgrading the job as well [18:06:27] joal: how far? [18:06:54] highest we have inserted so far: 21553554 [18:07:56] So, if we don't show the main-page 100 times more often than we have done in october, we are safe :) [18:08:10] Analytics, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Performance-Team: Spike: CentralNotice: Investigate new ways to implement cross-domain banner close-button functionality, to end the Special:HideBanners cookie storm - https://phabricator.wikimedia.org/T117433#1774568 (AndyRussG)... 
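Editor's sketch of the plan discussed above (add a bigint column, return either column in the read path, port old rows, and check how far the current maximum is from the 32-bit limit). This is a rough illustration in Python, not the RESTBase or Cassandra code: the new column name "v" comes from the chat, while the old column name "views" and the dict-shaped rows are assumptions made purely for the example.

```python
# Hypothetical illustration only: rows are plain dicts, not Cassandra rows.
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit int column can hold

NEW_COL = "v"      # new bigint column, name taken from the chat
OLD_COL = "views"  # assumed name for the old int column


def read_count(row):
    """Read path: prefer the new column, fall back to the old one.

    Uses an explicit None check so that a legitimate count of 0 in the
    new column is not mistaken for "missing".
    """
    value = row.get(NEW_COL)
    return value if value is not None else row.get(OLD_COL)


def port_row(row):
    """Backfill step: copy the old value into the new column and null the old one."""
    ported = dict(row)
    if ported.get(NEW_COL) is None:
        ported[NEW_COL] = ported.get(OLD_COL)
    ported[OLD_COL] = None  # null columns take no space, per the chat
    return ported


def int32_headroom(current_max):
    """How many whole multiples of current_max still fit in a signed 32-bit int."""
    return INT32_MAX // current_max


if __name__ == "__main__":
    old_row = {OLD_COL: 21553554, NEW_COL: None}
    print(read_count(old_row))            # served from the old column
    print(read_count(port_row(old_row)))  # served from the new column
    print(int32_headroom(21553554))       # whole multiples left before overflow
```

With the highest value quoted in the chat (21553554), the headroom works out to 99 whole multiples, which matches the "100 times more often" remark: a hundredfold increase would just exceed the int limit.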
[18:08:42] Analytics, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Performance-Team: Spike: CentralNotice: Investigate new ways to implement cross-domain banner close-button functionality, to end the Special:HideBanners cookie storm - https://phabricator.wikimedia.org/T117433#1774574 (ori) Why ar... [18:09:58] cool joal, ::phew:: [18:10:03] huhu:) [18:10:13] milimetric: I am sorry, I should thought and checked that before :( [18:10:34] sok, i should've too [18:10:41] i've been burned too many times by overflows in my life [18:10:51] yeah ... [18:11:22] milimetric: while at it, I'm trying to get null values instead of 0 in the per-article [18:14:11] joal: btw, did you consider using varints? [18:14:23] gwicke: we have not [18:15:01] they are a little bit slower to read, but at least you don't need to worry about limits [18:15:07] gwicke: I think it's averkill for our use:) [18:15:26] not sure if they end up being slower once you factor in compression [18:15:52] ints were too small, but long will be ok :) [18:16:16] I wouldn't be surprised if varints ended up with better compression overall [18:17:03] hm... gwicke ,I wonder :) [18:19:38] you can select them with 'varint' in the schema [18:27:37] gwicke: insertion on about wouldn't be that easy though [18:27:47] I think we'll stick with bigints :) [18:29:34] I don't think it would make a difference; the java driver should handle them too [18:30:03] it's a standard cassandra type [18:30:58] gwicke: java driver is no issue, the hadoop default CQLWriter is howver [18:32:17] And, while we could have gone the path rewriting / modifying it, we chose to use it with as few modifs as possible [18:32:38] makes sense [18:33:00] kind of ironic that the hadoop writer doesn't support varints [18:34:40] :) [18:35:11] gwicke: community hadoop support is kindy flaky --> use datastax enterprise edition ! 
[18:35:39] (PS11) Joal: Add refinery-cassandra module [analytics/refinery/source] - https://gerrit.wikimedia.org/r/232448 (https://phabricator.wikimedia.org/T108174) [18:35:46] joal: meh [18:41:18] Analytics, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Performance-Team: Spike: CentralNotice: Investigate new ways to implement cross-domain banner close-button functionality, to end the Special:HideBanners cookie storm - https://phabricator.wikimedia.org/T117433#1774779 (AndyRussG)... [18:49:08] halfak: Hello ! [18:49:16] hey joal! [18:49:30] Forgot to ping early this AM to get what I missed on Friday [18:49:34] Do we wait tomorrow's meeting to talk about the hive things ? [18:49:43] or now ? [18:49:47] halfak: --^ [18:49:52] I'm two steps deep into editing a thing. now is hard. [18:49:59] How much longer are you around? [18:50:00] tomorrow :) [18:50:05] Tomorrow it is! [18:50:18] That's actually a really good time slot to use :D [18:50:19] Will possibly stay a bit, but not being sure, tomorrow is however :) [18:56:14] Who would someone suggest to poke for this? https://gerrit.wikimedia.org/r/#/c/247866/ [18:58:26] Analytics-EventLogging, MediaWiki-extensions-RelatedArticles, MobileFrontend, Patch-For-Review, Reading Web Sprint 59 - Amsterdam and the hamsters: Upstream Schema.js from MobileFrontend to EventLogging - https://phabricator.wikimedia.org/T117140#1774825 (phuedx) @Nuria brings up valid concerns... [18:58:46] joal: "long" and "bigint" don't work :/ [18:58:55] milimetric: should be bigint [18:59:00] npm test fails saying long [18:59:05] sorry [18:59:08] no way :) [18:59:10] saying "can't create a column with that type" [18:59:28] but i guess that's sqlite [18:59:45] i seem to remember trying bigint and changing it to this for that reason, and forgetting to check up on it [19:07:27] gwicke: where do we file access requests for cassandra nodes? same process than other systems? [19:07:43] nuria: yeah [19:07:52] gwicke: ok, will do. 
[19:13:42] Guys, off for tonight [19:13:48] Will you tomorrow a-team :) [19:13:55] will SEE you ... [19:13:58] mwarf ... [19:14:04] have a nice night Joal [19:14:32] laters [19:14:59] Analytics, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Performance-Team: Spike: CentralNotice: Investigate new ways to implement cross-domain banner close-button functionality, to end the Special:HideBanners cookie storm - https://phabricator.wikimedia.org/T117433#1774905 (faidon) I d... [19:25:26] Analytics-Kanban: Reformat pageview API responses to allow for status reports and messages {slug} - https://phabricator.wikimedia.org/T117017#1774952 (Milimetric) Addressed in this pull request: https://github.com/wikimedia/restbase/pull/396 @Ironholds: we can improve the situation a bit but not completely.... [20:24:09] Analytics-EventLogging, MediaWiki-extensions-RelatedArticles, MobileFrontend, Patch-For-Review, Reading Web Sprint 59 - Amsterdam and the hamsters: Upstream Schema.js from MobileFrontend to EventLogging - https://phabricator.wikimedia.org/T117140#1775202 (Jdlrobson) Yup. I've added a few more c... [20:34:22] Analytics, Beta-Cluster-Infrastructure: deployment-fluorine fails puppet '/usr/sbin/usermod -u 10003 datasets' returned 4: usermod: UID '10003' already exists - https://phabricator.wikimedia.org/T117028#1775278 (hashar) p:Triage>Normal [20:35:53] Analytics-Engineering, Beta-Cluster-Infrastructure, Patch-For-Review, Varnish, WorkType-Maintenance: On beta cluster varnish stats process points to production statsd - https://phabricator.wikimedia.org/T116898#1775285 (hashar) p:Triage>Normal [20:53:31] (CR) Christopher Johnson (WMDE): adds bulk sparql query and output scripts (2 comments) [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/248033 (owner: Christopher Johnson (WMDE)) [21:01:13] ottomata: hey! i'm in the hangout for our 1:1 [21:09:43] OH [21:09:44] hey coming [21:33:44] ottomata: back? 
[21:36:54] Analytics, Design: Collect font support metrics - https://phabricator.wikimedia.org/T108879#1775549 (Jdlrobson) [21:42:38] nuria: was in :1: w k [21:42:53] ottomata: looked at tornado and thus far looks good [21:42:56] hey, could someone remind me how to view event logging tables for beta labs? [21:42:59] i want to check some data is logging [21:43:07] ottomata: seems real easy to setup behind nginx, not apache though [21:43:17] jdlrobson: yes, check: [21:43:18] Analytics-EventLogging, Editing-Department, Improving access, QuickSurveys, and 5 others: QuickSurveys: Schema changes - https://phabricator.wikimedia.org/T114164#1775609 (Jdlrobson) Open>Resolved [21:43:40] nuria: ok cool [21:44:05] jdlrobson: https://wikitech.wikimedia.org/wiki/Analytics/EventLogging/TestingOnBetaCluster#Database [21:44:11] machine is: [21:44:41] deployment-eventlogging03.eqiad.wmflabs jdlrobson [21:45:16] nuria: same credentials? [21:45:27] jdlrobson: user/pw for labs [21:45:31] for machine [21:45:56] ottomata: as it is async as so is nginx [21:47:16] nuria: oh so it's not stored in mysql? [21:47:34] jdlrobson: yes, i mean password to ssh is your usual labs one [21:48:03] nuri i ssh into deployment-eventlogging03.eqiad.wmflabs but i can't run mysql (access denied) [21:48:59] jdlrobson: let me try, you might need sudo which someone in labs has to give you a i guess [21:49:22] nuria: seems i don't have it [21:51:28] jdlrobson: ok, that's the issue. 
I do not think i can grant sudo there though, someone on labs might need to do it [21:51:32] jdlrobson: let me check [21:53:09] jdlrobson: i do not see anywhere to give you sudo so i gues syou need to ask in #wikimedia-labs [22:03:13] ok nuria will try my luck there [22:10:31] so [22:10:36] worth updating the wiki doc [22:11:12] the MySQL username / password for deployment-eventlogging03.eqiad.wmflabs is in labs/private.git under $passwords::mysql::eventlogging::password [22:11:40] Analytics-EventLogging, Editing-Department, Improving access, QuickSurveys, and 5 others: QuickSurveys: Schema changes - https://phabricator.wikimedia.org/T114164#1686646 (Jdlrobson) confirmed on deployment-eventlogging03.eqiad.wmflabs [22:17:47] Analytics, Design: Collect font support metrics - https://phabricator.wikimedia.org/T108879#1775724 (Edokter) [[ //en.wikipedia.org/wiki/Wikipedia:Typography | Wikipedia:Typography ]] links to some archived font surveys that break down the installed fonts for each of the three popular platforms (Windows,... [22:20:50] Analytics, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Performance-Team: Spike: CentralNotice: Investigate new ways to implement cross-domain banner close-button functionality, to end the Special:HideBanners cookie storm - https://phabricator.wikimedia.org/T117433#1775760 (Pcoombe) Ju... [22:20:58] milimetric: yt? [22:21:03] hey yes [22:21:17] nuria: ^ [22:21:24] milimetric: should i setupp another meeting for wikistats technical stack? 
[22:21:44] nuria: I think we need to get through the Perl code first [22:21:50] milimetric: ok [22:21:54] so when we finish all those tasks, then I think we can plan more [22:23:41] Analytics-Backlog: Lading page to show what information is where - https://phabricator.wikimedia.org/T117496#1775766 (Nuria) NEW [22:30:37] Analytics-Kanban: Pageview API documentation for end users {slug} - https://phabricator.wikimedia.org/T117226#1775813 (Nuria) a:mforns [22:36:55] o/ nite lal [22:36:56] *all [22:39:09] Analytics-Backlog: Community has a Stats lading page with links - https://phabricator.wikimedia.org/T117496#1775831 (kevinator) [22:48:13] Analytics-Backlog: fix 'day' description in RESTBase Pageview API - https://phabricator.wikimedia.org/T117502#1775890 (kevinator) NEW [22:49:07] Analytics-Backlog: fix 'day' description in RESTBase Pageview API - https://phabricator.wikimedia.org/T117502#1775904 (kevinator) [23:23:51] Analytics-Backlog: Community has a Stats lading page with links - https://phabricator.wikimedia.org/T117496#1775970 (Nuria) Do we need a landing page so outside parties can find what info is where? Like pageview stats from vital-signs, edit info from wikistats?... [23:43:13] Analytics, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Performance-Team: Spike: CentralNotice: Investigate new ways to implement cross-domain banner close-button functionality, to end the Special:HideBanners cookie storm - https://phabricator.wikimedia.org/T117433#1776028 (Ejegg) One...