[07:45:38] hello people
[07:45:45] we are stopping mysql on db1047 for maintenance
[07:46:04] Hi a-team
[07:58:29] hello :)
[07:58:59] so the first issue with db1047 is that db1108 does not have enough space to get all the data that db1047 holds
[07:59:22] the plan was to stop mysql on db1047, copy all mysql ibdata stuff to db1108 and then restart
[07:59:35] but we are going to follow a different route
[07:59:50] arf elukey :(
[07:59:53] namely dropping s1/s2 replication databases from db1047 (those are already on dbstore1002)
[08:00:14] keep log and other databases that people might need to save data from
[08:00:19] and copy them over to db1108
[08:00:32] in the end both db1108 and db1047 will not have wiki replication
[08:06:55] elukey: I wonder - if db1047 doesn't have enough space, wouldn't that be an issue?
[08:07:28] nono it is db1108 that is not capable of doing both wiki replication and log db replication
[08:08:45] joal: db1108 has ~4TB of space, db1047 6T
[08:11:05] so for example on db1047 we have now a huge ibdata1 file of 2.4T
[08:11:11] and several tokudb ones
[08:11:23] we'll have to use mysqldump for the log database :(
[08:12:01] hm elukey - Will we manage to fit everything we need in the new one?
[08:12:19] oh yes for sure, the log database is ~1.5T
[08:12:26] k
[08:12:37] it is the amount of garbage that we don't need that is the issue :D
[08:13:12] Ah - But i guess somebody needs it, right?
[08:14:07] the wiki replications no, in fact we dropped them
[08:14:21] there are other super old dbs/tables that might need to be copied
[08:14:24] we asked in https://phabricator.wikimedia.org/T156844
[08:14:43] but I am 99% sure that we'll drop everything eventually :D
[08:15:06] ok - It's just a shame to copy if we drop just after :)
[08:16:36] we are going to move only the log db
[08:16:39] via mysqldump
[08:16:49] the c*** will remain only on poor db1047 :D
[08:17:06] ok elukey - Sorry if I ask dumb questions, I have not been following the process at all
[08:17:44] joal: no no please ask, I am writing down everything to get feedback
[08:17:55] so we are on the same page
[08:50:09] 10Analytics, 10Patch-For-Review, 10User-Elukey: Move away from jmxtrans in favor of prometheus jmx_exporter - https://phabricator.wikimedia.org/T175344#3711907 (10fgiunchedi) >>! In T175344#3709806, @Ottomata wrote: >> These seem to have a label with potentially high cardinality, do you know how node_id chan...
[10:20:11] 10Analytics-EventLogging, 10Analytics-Kanban, 10MW-1.31-release-notes (WMF-deploy-2017-10-10 (1.31.0-wmf.3)), 10Patch-For-Review: PageContentSaveComplete. Stop collecting - https://phabricator.wikimedia.org/T177101#3712019 (10Aklapper)
[11:47:22] * elukey lunch!
[12:58:43] db1108 is still copying...
[13:00:42] elukey: we're not used to playing with terabytes outside hadoop anymore ;)
[13:02:04] milimetric: Good morning, when you're awake and have time, I'd like to braindump my findings on WKS2 data quality :)
[13:06:03] joal: let's do it!
[13:06:20] milimetric: Arf - in 5 minutes?
[13:06:40] joal: sure, np
[13:10:55] Ready I am, milimetric :)
[13:11:05] in da cave
[13:11:09] OMW !
[13:23:57] 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Mirror topics from main Kafka clusters (from main-eqiad) into jumbo-eqiad - https://phabricator.wikimedia.org/T177216#3712258 (10Ottomata) In T175344 @fgiunchedi wrote: > In this particular case for example aggregations on the metric (e.g. sum()) w...
[13:37:29] hiii elukey got a second for metric naming brain bounce?
[13:41:57] ottomata: sure!
[13:43:46] bc?
[13:43:50] sure
[13:44:10] oh bc is busy
[13:44:11] -2
[14:02:23] 10Analytics-EventLogging, 10Analytics-Kanban, 10MW-1.31-release-notes (WMF-deploy-2017-10-10 (1.31.0-wmf.3)), 10Patch-For-Review: PageContentSaveComplete. Stop collecting - https://phabricator.wikimedia.org/T177101#3712350 (10elukey) currently blocked by the maintenance for T177405
[14:02:38] 10Analytics-EventLogging, 10Analytics-Kanban, 10MW-1.31-release-notes (WMF-deploy-2017-10-10 (1.31.0-wmf.3)), 10Patch-For-Review: PageContentSaveComplete. Stop collecting - https://phabricator.wikimedia.org/T177101#3712352 (10elukey) 05Open>03stalled p:05Triage>03Normal
[14:02:57] 10Analytics-EventLogging, 10Analytics-Kanban, 10MW-1.31-release-notes (WMF-deploy-2017-10-10 (1.31.0-wmf.3)), 10Patch-For-Review, 10User-Elukey: PageContentSaveComplete. Stop collecting - https://phabricator.wikimedia.org/T177101#3647234 (10elukey)
[14:11:22] elukey: your stuff is good, way better than .* :)
[14:11:40] \o/
[14:11:40] the broker beans are structured v different than the new consumer beans
[14:19:40] maintenance finished for db1047, all good
[14:19:51] db1108 is still not ready but going to be soonish
[14:28:18] nice
[14:29:17] two important things: we don't have the s1/s2 replication and tables in db1047 anymore
[14:29:31] db1108 will not have them from the beginning, as established
[14:29:48] so currently only dbstore1002 holds s1/s2 tables
[14:32:03] elukey: any more comments on https://gerrit.wikimedia.org/r/#/c/384586/
[14:32:03] ?
[14:35:06] running pcc now and checking the code
[14:42:09] ottomata: only one doubt, mostly because of ignorance: what is create_resources in the profile?
[14:43:34] elukey: i'm using that because i want to be able to configure mirror maker producer and consumer properties without making every single one a profile param
[14:43:56] create_resources allows me to pass a hash of properties to declare the mirror::instance
[14:44:04] rather than hardcoding which ones i'm passing
[14:44:08] that way the defaults from mirror::instance remain
[14:44:15] if they are not set in $properties
[14:48:10] this is a bit too advanced for me, I am reading confluent::kafka::mirrors but I have to admit that I don't get much
[14:48:48] mirrors isn't really used here
[14:48:59] it's just using mirror::instance
[14:49:09] yes but it uses create_resources as well
[14:49:16] yes
[14:49:17] I was checking other examples
[14:49:43] confluent mirrors class allows you to declare the mirror maker instances you want via params
[14:49:59] that way, you don't have to make a new class for every cluster you want to consume into the aggregate cluster
[14:50:12] but, we don't need that, since thus far we only ever mirror from one cluster -> another
[14:50:18] we haven't yet had any real 'aggregate' cases
[14:50:27] where an aggregate cluster mirrors multiple other clusters
[14:50:36] so,
[14:50:55] the profile mirror class allows for only one mirror instance per host
[14:50:57] so
[14:51:10] consuming only from one source cluster
[14:51:31] so, while both the profile mirror class, and the confluent mirrors class use create_resources
[14:51:35] they are using it for different reasons :)
[14:51:49] in profile mirror, it's just to simplify parameterization overrides
[14:52:08] for confluent mirrors, it's to make it easier to set up multiple mirror maker instances on the same host
[14:52:13] ah you could have used the direct instantiation
[14:52:48] so you basically do deep_merge of defaults + hiera, and then pass them to the class
[14:52:58] yup
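For readers unfamiliar with the Puppet pattern described above, here is a loose Python analogy of the idea — illustrative only, not the actual puppet code, and the property names are invented: defaults live with the definition, hiera overrides only the keys it sets, and the merged hash is passed through without enumerating every parameter.

```python
# Loose analogy for the create_resources + deep_merge pattern; not puppet code.
# Property names below are invented for illustration.

MIRROR_INSTANCE_DEFAULTS = {
    'num_streams': 1,
    'offset_commit_interval_ms': 5000,
    'heap_opts': '-Xmx1g',
}

def mirror_instance(**properties):
    """Stands in for confluent::kafka::mirror::instance."""
    print(properties)

def declare_mirror(hiera_overrides):
    # like: create_resources('...::mirror::instance', deep_merge($defaults, $properties))
    merged = {**MIRROR_INSTANCE_DEFAULTS, **hiera_overrides}
    mirror_instance(**merged)

# Only the overridden key changes; unspecified keys keep their defaults.
declare_mirror({'num_streams': 4})
```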
[14:53:33] 10Analytics-Kanban: Fix mediawiki history page reconstruction bug on move-conflict - https://phabricator.wikimedia.org/T179074#3712458 (10JAllemandou)
[14:53:51] and this works because the keys in the hash have the same name as the parameters?
[14:53:52] milimetric: if you have a minute to proof-read --^
[14:53:54] oh my comment on line 109 there trailed off, guess i never finished that thought
[14:54:00] adding that might help elukey..
[14:54:00] doing
[14:54:20] * milimetric reads joal's bug description
[14:58:43] 10Analytics-Kanban: Fix mediawiki history page reconstruction bug on move-conflict - https://phabricator.wikimedia.org/T179074#3712472 (10Milimetric)
[14:58:56] ottomata: read the comment, it seems a neat trick but I'd have personally not used defaults and mentioned them in hiera (to see exactly what ends up in the class' parameters etc..)
[14:59:06] having said that, I am not blocking the change of course
[14:59:23] but I hate hash merges with defaults :D
[15:00:45] lemme comment the cr
[15:02:09] ping ottomata
[15:02:56] ah
[15:25:53] elukey: not using create_resources means I have to add another 8ish parameters right now, possibly more later if we want to override other things
[15:29:29] * elukey nods
[15:29:51] don't want to block it, it was a bit "condensed" but I got the main idea, so I am fine with it :)
[15:30:58] ok
[15:35:13] 10Analytics-Kanban: Fix mediawiki history page reconstruction bug on move-conflict - https://phabricator.wikimedia.org/T179074#3712458 (10mforns) We can try to use the same timestamp (minimum between revision timestamp and move timestamp) to correct both the redirect page creation and the original page move event.
[15:48:29] elukey: AHHHHHHHGGHGHGHGHG
[15:48:31] https://kafka.apache.org/documentation/#upgrade_11_0_0
[15:48:35] However, if your brokers are older than 0.10.0, you must upgrade all the brokers in the Kafka cluster before upgrading your clients.
[15:48:57] i think the mirror maker consumer i'm using won't work as is with main-eqiad
[15:48:59] AGRGH
[15:49:12] getting the good ol'
[15:49:13] [2017-10-26 15:48:58,199] ERROR Processor got uncaught exception. (kafka.network.Processor)
[15:49:13] java.lang.ArrayIndexOutOfBoundsException
[15:54:23] OOF
[15:54:53] uffffffffff
[15:58:04] yargh
[15:58:06] not sure what to do about this.
[15:58:07] hm.
[15:58:21] i should have tested in labs with the same setup as prod
[15:58:27] i.e. consuming from a 0.9 cluster
[15:58:29] ROOKIE MISTAKE
[15:59:53] elukey: we could run the old mirror maker from elsewhere
[15:59:57] rather than colocated with the new cluster
[16:00:02] until we upgrade main clusters too
[16:00:04] or
[16:00:13] we could schedule a main cluster upgrade sooner rather than later (yeah right! :p)
[16:00:28] i think those are our only options
[16:00:52] Ack, no
[16:00:55] option 1 there won't work either
[16:01:10] hmmm
[16:01:15] lemme try real quick i guess
[16:04:20] milimetric, mforns: About the accepted-difference between move-event-timestamp and revision-timestamp
[16:04:37] milimetric, mforns: Do we make this configurable with a default of say, 5 seconds?
[16:05:18] joal: I don't see why it would almost ever be more than 2 seconds
[16:05:27] :D
[16:05:31] it's only if there's a major problem with the app servers
[16:05:39] milimetric: I'm actually going to try to measure that
[16:06:25] the question is, what are the cases where it would be a bad match if the tolerance is too high
[16:06:27] 10Analytics-Kanban, 10EventBus, 10Services (next): Malformed HTTP message in EventBus logs - https://phabricator.wikimedia.org/T178983#3712704 (10fdans)
[16:10:16] 10Analytics-Kanban, 10Analytics-Wikistats: Beta Release: Support Annotations on Wikistats 2.0 graphs - https://phabricator.wikimedia.org/T178813#3712709 (10fdans)
[16:10:45] ok, elukey producer seems to work.
[16:10:59] if i use old mirror maker
[16:11:02] \o/
[16:11:04] so we can run the mirror instance elsewhere
[16:11:10] not sure where though.
[16:11:33] so, that will hold us over for mirroring from main -> jumbo
[16:11:37] but when we do upgrade main
[16:11:40] we will have to be really careful
[16:11:47] and remember to move the mirror maker to the new version
[16:12:33] yargh
[16:12:36] so annoying
[16:13:41] all that prometheus work i just did won't be used until after we upgrade main!
[16:13:42] gah!
[16:14:39] elukey: where do you think we should run the mirror maker instances?
[16:14:41] on analytics brokers?
[16:14:59] that would mean we can't decom them even after we've migrated all clients to jumbo
[16:16:05] that would be annoying but it might be the only viable solution..
[16:17:11] we could decom some of them, we don't need instances on all 6
[16:17:22] ok, puppetizing that :/
[16:21:11] 10Analytics, 10Analytics-Wikistats: Vital Signs: Please provide an "all languages" de-duplicated stream for the Community/Content groups of metrics - https://phabricator.wikimedia.org/T120037#3712741 (10fdans)
[16:22:21] 10Analytics, 10Analytics-Wikistats: Vital Signs: Please provide an "all languages" de-duplicated stream for the Community/Content groups of metrics - https://phabricator.wikimedia.org/T120037#1843258 (10fdans) Rolling active editors doesn't exist as a metric in the new Wikistats. However, we have all-projects:...
[16:22:56] 10Analytics, 10Analytics-Wikistats: Vital Signs: Please make the data for enwiki and other big wikis less sad, and not just be missing for most days - https://phabricator.wikimedia.org/T120036#3712746 (10fdans)
[16:24:22] 10Analytics, 10Analytics-Wikistats: Vital Signs: Please make the data for enwiki and other big wikis less sad, and not just be missing for most days - https://phabricator.wikimedia.org/T120036#1843248 (10fdans) The new Wikistats has gapless data for big (and small) wikis from the beginning of time. {F10451922}
[16:28:37] 10Analytics-Cluster, 10Analytics-Kanban: Beeline does not print full stack traces when a query fails {hawk} - https://phabricator.wikimedia.org/T136858#3712774 (10fdans) a:03Milimetric
[16:31:49] 10Analytics-EventLogging, 10Analytics-Kanban: Find an alternative query interface for eventlogging on analytics cluster that can replace MariaDB - https://phabricator.wikimedia.org/T159170#3712797 (10fdans)
[16:35:37] 10Analytics: Sqoop wbc_entity_usage from all wikis into hadoop (HDFS) - https://phabricator.wikimedia.org/T167290#3712828 (10fdans) @Addshore do you think you'll work on productionising this anytime soon?
[16:44:18] elukey: https://gerrit.wikimedia.org/r/#/c/386648/
[16:45:32] ottomata: gtg but I'll read tomorrow sorry!
:(
[16:47:31] ok
[16:49:25] Gone for dinner, back after
[16:53:59] mforns: you mentioned you are working on eventlogging refine next week
[16:54:22] i forget, are you working on getting the refine hive job productionized ? or just the druid part right now?
[16:54:48] ottomata, no just the eventlogging to druid
[16:56:09] aye ok
[16:56:29] so i can try and pick up some of the hive productionize part then i guess, if i'm still blocked on kafka cergen
[16:56:36] might need some joal help a little bit
[16:56:38] we will see :)
[16:58:41] !log now mirroring main Kafka cluster topics to jumbo Kafka cluster, with MirrorMaker instances running on analytics-eqiad broker nodes. https://phabricator.wikimedia.org/T177216
[16:58:43] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:59:56] ok
[17:00:01] ottomata: we could pick up on our earlier conversation if you have a minute.
[17:02:15] 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Mirror topics from main Kafka clusters (from main-eqiad) into jumbo-eqiad - https://phabricator.wikimedia.org/T177216#3712901 (10Ottomata) Crap crackers. From https://kafka.apache.org/documentation/#upgrade_11_0_0 > if your brokers are older than...
[17:02:19] Krinkle: ya
[17:04:17] Krinkle: we left off with me saying that eventlogging topics weren't available in codfw
[17:04:18] :/
[17:04:46] ottomata: Yeah, so for now consumption would be tied to eqiad kafka.
[17:04:49] That's okay.
[17:05:10] But that leaves the question of how both are configured to produce to the "main" statsd, without both doing the same thing.
[17:05:51] We could either somehow distribute the work (and waste a bit of cross-DC latency for 1/2 packets), or have the switch be manual and just active/inactive.
[17:05:58] so, you want to have an eqiad hafnium and codfw hafnium, running the same stuff, but in active/inactive
[17:06:00] ya
[17:06:05] The former is preferred so that if either goes down, the other continues.
[17:06:12] you could do that
[17:06:18] actually
[17:06:22] But I'm not sure how to do that without committing offsets?
[17:06:22] these topics have only one partition
[17:06:31] oh ya you'd have to commit offsets and use a consumer group
[17:06:34] which is fine
[17:06:36] right?
[17:06:49] Well, I want to start from 'now' on restart, not where I left off.
[17:06:51] because statsd..
[17:06:53] (for now)
[17:06:58] ...because these topics have one partition, you can have 2 consumers in your group, but one would be doing nothing at any given time
[17:07:15] but, you wouldn't have control over which consumer process was busy
[17:07:18] That's okay. It will still start sending to the other one if the random/current one stops, right?
[17:07:23] so you might be consuming in codfw most of the time
[17:07:27] right
[17:07:34] That's totally fine as far as I'm concerned.
[17:07:36] ok cool
[17:07:39] as for offset commits
[17:07:39] hm
[17:07:45] yeah statsd timestamp stuff
[17:07:45] hm
[17:07:57] Is it supported to have a consumer group and commit offsets, but ignore them on startup?
[17:08:04] you could produce directly to graphite instead? and set the timestamp?
[17:08:09] hmm
[17:08:12] Nope, need the expanded properties.
[17:08:18] .rate, .median, .p99 etc.
[17:08:19] aye
[17:08:33] There's larger work ongoing to switch to Prometheus and abandon Graphite entirely
[17:08:37] but that's separate.
[17:08:46] We'd want both running for a good year regardless for continuity.
[17:09:06] hmmmmm
[17:09:14] i wonder if you actually have to commit offsets....
[17:09:16] you might now
[17:09:18] not*
[17:09:29] And then there's also statsv, which would work the same way, except it doesn't involve EventLogging, so might be different.
[17:09:35] I guess that one might have a codfw mirror.
[17:09:41] it should, it doesn't now
[17:09:49] we could move the statsv topic to the main cluster instead of the 'analytics' cluster
[17:09:54] but we wouldn't want to use that in the proposed model since they'd need to be aware of each other consuming.
[17:09:55] then it would
[17:09:58] aye
[17:10:04] anyway, for offsets
[17:10:08] Yeah :)
[17:10:08] we'd have to try and see
[17:10:09] but
[17:10:14] i think if you turn off auto.offset.commit
[17:10:17] and never commit yourself
[17:10:22] but still use a consumer group
[17:10:27] kafka will balance your consumers
[17:10:31] and whenever a rebalance happens
[17:10:35] (or restart)
[17:10:50] just use the latest offset committed value (nothing there), or the value of auto.offset.reset=largest
[17:10:58] Right
[17:11:10] not 100% on that, but i think that is how it should work
[17:11:46] I'll try with two simple python processes consuming from something that has UUIDs and writing them to different text files, and check - when killing them randomly and when letting them run for a while - how often it rebalances by itself, and how it reacts when a client dies.
[17:12:01] I imagine without the offsets, it will probably find out "late" that a client dies?
[17:12:02] aye cool
[17:12:21] hmm, not sure on that either, but i don't think offset commits are tied to client aliveness
[17:12:23] i think there is a heartbeat
[17:12:26] Although I assume it still does TCP to verify an individual packet was received?
[17:12:39] And consider it failed if it failed?
[17:12:40] it's totally expected that a client might want to control when offsets are committed
[17:12:49] Oh well.
[17:13:02] so i doubt kafka would rebalance just because it hasn't seen a committed offset in a while
[17:13:16] Anyway, I'll start with active/inactive. And then tinker with changing it. Wanna get off hafnium first and into the new VMs.
[17:14:07] heartbeat.interval.ms
[17:14:11] default 3 seconds
[17:14:21] Heartbeats are used to ensure that the consumer's session stays active and to facilitate rebalancing when new consumers join or leave the group.
[17:14:40] oh actually
[17:14:40] session.timeout.ms
[17:14:42] might be more relevant
[17:14:46] The timeout used to detect consumer failures when using Kafka's group management facility.
[17:14:50] 10 seconds
[17:15:02] The consumer sends periodic heartbeats to indicate its liveness to the broker. If no heartbeats are received by the broker before the expiration of this session timeout, then the broker will remove this consumer from the group and initiate a rebalance.
[17:15:20] Ah, cool.
[17:15:23] Perfect.
[17:15:30] That's good enough for this purpose.
[17:15:31] so, with that default, and no offset commit, you'd lose 10 seconds of metrics if a consumer dies
[17:15:38] I might lower it a bit.
[17:15:40] aye
[17:15:43] But this mechanism sounds good enough.
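A minimal sketch of the active/standby setup discussed above, assuming the kafka-python client; the topic, group, broker, and handler names are placeholders rather than the real navtiming code. Running the same script in both data centers with the same group_id leaves one consumer active at a time; because no offsets are ever committed, whichever process takes over after a rebalance or restart begins at "now".

```python
# Sketch only: kafka-python consumer in a group, never committing offsets.
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    'eventlogging_NavigationTiming',            # single-partition topic (placeholder name)
    bootstrap_servers='kafka1001.eqiad.wmnet:9092',  # placeholder broker
    group_id='navtiming-statsd',                # both DCs join the same group
    enable_auto_commit=False,                   # never commit offsets...
    auto_offset_reset='latest',                 # ...so every (re)start begins at "now"
    session_timeout_ms=10000,                   # broker evicts a silent consumer after 10s
    heartbeat_interval_ms=3000,                 # liveness heartbeat every 3s
)

# With one partition, only one group member owns it at a time.  If that process
# dies, the group rebalances after session_timeout_ms and the standby takes
# over, losing at most ~10 seconds of metrics.
for message in consumer:
    send_to_statsd(message.value)               # placeholder for the real handler
```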
[17:15:45] as for statsv...we should move that to main
[17:15:51] i'll make a ticket
[17:15:54] Thx
[17:16:01] i was getting ready to move that client to the new jumbo cluster (which deprecates the analytics one)
[17:16:04] but it should go to main for sure
[17:18:47] 10Analytics, 10Analytics-Cluster: Support multi DC statsv - https://phabricator.wikimedia.org/T179093#3713041 (10Ottomata)
[17:18:57] 10Analytics-Cluster, 10Analytics-Kanban: Port Clients to new jumbo cluster - https://phabricator.wikimedia.org/T175461#3713055 (10Ottomata)
[17:19:00] 10Analytics, 10Analytics-Cluster: Support multi DC statsv - https://phabricator.wikimedia.org/T179093#3713054 (10Ottomata)
[17:19:20] 10Analytics, 10Analytics-Cluster: Support multi DC statsv - https://phabricator.wikimedia.org/T179093#3713041 (10Ottomata)
[17:19:36] 10Analytics-Cluster, 10Analytics-Kanban: Port Clients to new jumbo cluster - https://phabricator.wikimedia.org/T175461#3594088 (10Ottomata)
[17:19:38] 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Port statsv to kafka-jumbo - https://phabricator.wikimedia.org/T176352#3713059 (10Ottomata) 05Open>03declined Declining in favor of T179093
[17:21:29] ottomata, btw, I can't make the http request work from stat1005 to druid, is that expected? it times out
[17:21:53] oh
[17:21:57] probably analytics cluster vlan rules
[17:22:03] mforns: on broker port?
[17:22:16] ooooh, I forgot the port...
[17:22:18] one sec
[17:22:23] mforns: you using the lvs uri?
[17:22:25] oh
[17:22:27] which druid cluster?
[17:22:27] haha
[17:23:14] ottomata, cool, it works now, but I get a "peer not authenticated" error
[17:23:25] druid1001.eqiad.wmnet
[17:23:41] port?
[17:23:50] 809
[17:23:52] 8090
[17:24:08] mforns: curl druid1001.eqiad.wmnet:8090/status | jq
[17:24:11] works for me
[17:24:14] from stat1005
[17:24:36] | jq .
[17:24:40] or just
[17:24:41] curl druid1001.eqiad.wmnet:8090/status
[17:25:37] 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Mirror topics from main Kafka clusters (from main-eqiad) into jumbo-eqiad - https://phabricator.wikimedia.org/T177216#3713110 (10Ottomata)
[17:25:46] 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Mirror topics from main Kafka clusters (from main-eqiad) into jumbo-eqiad - https://phabricator.wikimedia.org/T177216#3650677 (10Ottomata) FYI, https://grafana.wikimedia.org/dashboard/db/kafka-mirrormaker?orgId=1&var-instance=main-eqiad_to_jumbo-eq...
[17:27:15] ottomata, yea works for me too, will look into why spark-shell returns auth error
[17:27:59] thanks!
[17:29:15] 10Analytics, 10DBA, 10Data-Services, 10Research, 10cloud-services-team (Kanban): Implement technical details and process for "datasets_p" on wikireplica hosts - https://phabricator.wikimedia.org/T173511#3713170 (10bd808) >>! In T173511#3558849, @bd808 wrote: > The urgency for `datasets_p` is that finding...
[17:30:02] ottomata, Ok I got it, was my fault :P thx!
[18:00:02] Hey ottomata - Was gone for dinner - How may I help?
[18:01:12] oh joal nothing right now, sorry didn't mean to ping
[18:01:21] but, i might work on productionizing eventlogging refine next week
[18:01:29] Ah, nice !
[18:01:31] so will need final reviews, maybe even some weird spark debugging if things get funky
[18:01:40] Sure, sounds great :!
[18:01:51] ottomata: I've been asking for that for long enough ;)
[18:10:04] 10Analytics-Kanban, 10RESTBase-API, 10Services (later), 10User-mobrovac: Expose pageview data in each project's REST API - https://phabricator.wikimedia.org/T119094#3713346 (10Nuria) 05Open>03declined
[18:39:26] RECOVERY - HDFS capacity used percentage on analytics1001 is OK: OK: Less than 60.00% above the threshold [85.0]
[18:40:08] \o/
[18:49:32] elukey: ya, puffff
[18:52:52] phew
[19:13:39] 10Analytics-Kanban, 10EventBus, 10Services (next): Malformed HTTP message in EventBus logs - https://phabricator.wikimedia.org/T178983#3709298 (10Ottomata) Pretty sure this isn't caused by Tornado's `max_buffer_size`. From what I can tell, this defaults to 100M. Also, I just tried to submitted some really...
[19:15:55] * joal is a bit drunk now :)
[19:18:01] hah
[20:01:23] Question for anybody online: Do you know if there are any tools in analytics-refinery for dealing with page title normalization and cleanup?
[20:01:43] I have one data source with namespaces embedded in title and another without.
[20:02:15] I imagine going back and forth with this and dealing with underscores vs %20s vs spaces is a very common problem.
[20:11:24] 10Analytics, 10Pageviews-API: Endpoints that 404 no longer have "Access-Control-Allow-Origin" header - https://phabricator.wikimedia.org/T179113#3713646 (10MusikAnimal)
[20:11:38] 10Analytics, 10Pageviews-API: Endpoints that 404 no longer have the "Access-Control-Allow-Origin" header - https://phabricator.wikimedia.org/T179113#3713659 (10MusikAnimal)
[20:11:57] Shilad: there is some code to that extent, yes, give me a sec
[20:12:14] Shilad: BTW did you do changes to your job to note that it will run only on top of desktop data?
[20:13:24] Shilad: these https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/PageviewDefinition.java#L386
[20:13:25] ?
[20:15:04] nuria_: Yes! I made those changes. Thanks for them.
[20:15:56] 10Analytics, 10Pageviews-API: Endpoints that 404 no longer have the "Access-Control-Allow-Origin" header - https://phabricator.wikimedia.org/T179113#3713665 (10MusikAnimal)
[20:16:18] Shilad: ok, take a look at link, that class needs some refcatoring as those methods are not available for use
[20:17:06] *refactoring
[20:19:24] This is helpful. Thanks!
[20:20:14] nuria_: I tried to leave responses to your code review, but I think I may not have the gerrit workflow quite right. I have a feeling my responses were never properly saved.
[20:20:18] Sorry about that.
[20:21:26] Shilad: you need to hit "reply" on main review page so they are publisged
[20:21:29] *published
[20:21:42] Shilad: making comments just stores them until you hit "reply"
[20:22:07] "stores them w/o them being visible, that is"
[20:22:37] Aha! That was definitely the problem. Next time I'll get it right.
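For the kind of title cleanup Shilad asks about above, a rough Python sketch of the idea — a hypothetical helper, not the refinery code linked in the conversation — that makes underscore, percent-encoded, and spaced titles comparable, plus a toy namespace splitter for the source that embeds namespaces in the title:

```python
# Hypothetical helpers, not refinery code: make "Talk:Foo_bar", "Talk:Foo%20bar"
# and "Talk:Foo bar" compare equal, and optionally peel off an embedded namespace.
from urllib.parse import unquote

NAMESPACES = {'Talk', 'User', 'User talk', 'Wikipedia'}  # illustrative subset

def normalize_title(title):
    title = unquote(title)              # "%20" -> " ", "%27" -> "'", ...
    title = title.replace('_', ' ')     # dbkey-style underscores -> spaces
    title = ' '.join(title.split())     # collapse stray whitespace
    # first character is case-insensitive on most wikis
    return title[:1].upper() + title[1:]

def split_namespace(title):
    prefix, sep, rest = title.partition(':')
    if sep and prefix in NAMESPACES:
        return prefix, rest
    return '', title

assert normalize_title('Talk:Foo_bar') == normalize_title('Talk:Foo%20bar')
```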
[20:34:15] (03PS1) 10Nuria: Configure dygraphs so Y-axis starts at '0' [analytics/dashiki] - 10https://gerrit.wikimedia.org/r/386708 (https://phabricator.wikimedia.org/T178602)
[20:35:21] fdans, milimetric : if you are still there 1 line code change
[20:36:23] fdans, milimetric : https://gerrit.wikimedia.org/r/#/c/386708/
[20:40:36] nuria_: it sounds like it should be valueRange: [0, null], but [0] is kind of the same thing I guess.
Ok, merging
[20:40:43] (03CR) 10Milimetric: [V: 032 C: 032] Configure dygraphs so Y-axis starts at '0' [analytics/dashiki] - 10https://gerrit.wikimedia.org/r/386708 (https://phabricator.wikimedia.org/T178602) (owner: 10Nuria)
[20:41:05] milimetric: k
[21:01:03] 10Analytics-Dashiki, 10Analytics-Kanban, 10Patch-For-Review: Add option to not truncate Y-axis - https://phabricator.wikimedia.org/T178602#3713820 (10Nuria) Deployed now: https://page-creation.wmflabs.org/#projects=enwiki/metrics=Daily%20Pages%20Created
[21:08:50] Nettrom: let me know if this looks good y-axis wise: https://page-creation.wmflabs.org/#projects=enwiki/metrics=Daily%20Pages%20Created
[21:10:13] nuria_: I checked some of the other graphs as well, looks great to me! Thanks so much for patching this so quickly!
[21:10:34] Nettrom: np, dygraphs did it all really
[21:10:56] easy fixes are the best kind of fixes :)
[21:11:00] (maybe)
[21:26:24] 10Analytics-Dashiki, 10Analytics-Kanban, 10Patch-For-Review: Add option to not truncate Y-axis - https://phabricator.wikimedia.org/T178602#3713841 (10Nettrom) 05Open>03Resolved Looks good to me, thanks again!