[03:36:47] drdee: ping [03:36:53] hey [03:36:54] drdee: is the time of the meeting GMT ? [03:37:07] drdee: does Google tell me the localized time of the meeting ? [03:37:18] hold o [03:37:42] it's 2pm EST [03:37:44] ok [03:37:50] so 8pm GMT [03:38:04] but i'll ping you 15 minutes before it starts [03:38:04] oh, alright [03:38:12] so I still have time [03:38:17] any progress today? [03:38:28] did you see the sheet I sent you [03:38:28] ? [03:38:44] yes I saw it, I see the discrepancies [03:39:07] nothing palpable yet, I'm working on something to generate log lines so I can generate them automatically from the tests [03:39:27] I'll put my alarm and a song on my speakers so I can wake up at 2pm :D [03:39:30] but I'll be at the meeting [03:39:44] I think I have some material to discuss on there [03:40:07] but maybe Tilman will perhaps be mainly interested in Mobile [03:40:24] and I received a forwarded email from Erik which was from Amit and Amit is also interested in Mobile I think [03:40:36] I should fix that mobile table I think it's empty now [03:47:21] ottomata!!!!!!!!! [03:47:54] yooy [03:55:03] drdee: 2pm EST != 8pm UTC [03:55:13] we are currently -0500 [03:55:37] average_drifter: ^ [03:55:53] what is average_drifter's TZ? [04:51:20] GMT+2 [05:12:28] ok, Google says 9PM -> 9:30PM Bucharest time :) got it [05:12:29] thanks [07:58:26] Generate::Squid is ready [07:58:34] conquering the bugs with it [08:40:49] morning hashar [08:40:59] oh you are still awake :) [08:41:02] hello :-D [08:41:06] I am awake awake all the way [08:41:24] I'll get maté this week so I'll be more awake than ever before [08:41:33] http://en.wikipedia.org/wiki/Mate_(beverage) [08:41:35] this one [08:41:53] hashar: I am in need of help on jenkins [08:41:53] never heard of it before [08:42:37] (so maté seems to be the south america tea) [08:42:42] average_drifter: i will be glad to help [08:42:55] i also need to migrate all your jenkins jobs to use the new triggering system [08:43:22] hashar: about new triggering, can you please explain what kind of triggering ? [08:43:32] hashar: currently they are properly triggered [08:43:40] I'm probably missing something [08:43:42] I have been working on a new gateway between Gerrit and Jenkins [08:43:53] the "Gerrit trigger plugin" in Jenkins does not fit our needs [08:44:19] so I have migrated to a python daemon known as Zuul which listen for Gerrit events and trigger Jenkins jobs based on a specification [08:44:25] that let us finely tune how we trigger jobs [08:44:31] in short: it is easier to handle :) [08:45:03] average_drifter: anyway, what do you need help with ? :-D [08:45:22] hmm [08:45:27] interesting with this new Zuul [08:45:40] so have you done testing on some git repos with it and it works well ? [08:45:59] yeah all MediaWiki extensions and MediaWiki core are now triggered by Zuul [08:46:08] nice, and they are all happy ? [08:46:25] mostly :) [08:46:46] I am also now using a python script to generate the jenkins jobs [08:47:00] also based on a spec file ? [08:47:11] so I just have to edit a .yaml file, run the script and the jobs are updated. Save me from clicking / modifying jobs manually [08:47:20] maybe I should do an office hour [08:47:22] so I can actually just write a spec file now in my repo and some Python script will find my repo and add it to jenkins ? Can I hope for that ? 
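The YAML-driven job generation hashar describes just above (edit a .yaml spec, run a Python script, and the Jenkins jobs are regenerated) works roughly like the sketch below. It is only an illustration: the Jenkins host, job fields and credentials are placeholders, and the real tool and its schema live in the integration/jenkins-job-builder-config.git repository linked next in the conversation.

```python
#!/usr/bin/env python
# Illustrative sketch only: read a YAML job spec and push the resulting job
# configuration to Jenkins over its REST API. The spec is assumed to look like
#   jobs:
#     - {name: ..., description: ..., git_url: ..., command: ...}
# All names, fields and the Jenkins URL are made up for the example.
import sys
import requests
import yaml

JENKINS_URL = "https://integration.example.org"   # assumption, not the real host

CONFIG_TEMPLATE = """<project>
  <description>{description}</description>
  <scm class="hudson.plugins.git.GitSCM">
    <userRemoteConfigs><hudson.plugins.git.UserRemoteConfig>
      <url>{git_url}</url>
    </hudson.plugins.git.UserRemoteConfig></userRemoteConfigs>
  </scm>
  <builders>
    <hudson.tasks.Shell><command>{command}</command></hudson.tasks.Shell>
  </builders>
</project>"""

def push_job(session, job):
    """Update the job's config.xml, or create the job if it does not exist yet."""
    xml = CONFIG_TEMPLATE.format(**job)
    headers = {"Content-Type": "application/xml"}
    # Updating an existing job is a POST to /job/<name>/config.xml ...
    r = session.post("%s/job/%s/config.xml" % (JENKINS_URL, job["name"]),
                     data=xml, headers=headers)
    if r.status_code == 404:
        # ... and creating a new one is a POST to /createItem?name=<name>.
        r = session.post("%s/createItem" % JENKINS_URL,
                         params={"name": job["name"]},
                         data=xml, headers=headers)
    r.raise_for_status()

def main(spec_file):
    with open(spec_file) as f:
        spec = yaml.safe_load(f)
    session = requests.Session()
    session.auth = ("jenkins-user", "api-token")   # placeholder credentials
    for job in spec["jobs"]:
        push_job(session, job)
        print("updated job %s" % job["name"])

if __name__ == "__main__":
    main(sys.argv[1])
```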
:) [08:47:31] yeah exactly [08:47:43] average_drifter: the confs are at https://gerrit.wikimedia.org/r/gitweb?p=integration/jenkins-job-builder-config.git;a=tree [08:47:49] I have not documented it yet though :/ [08:48:18] hashar: I have only one concern regarding that [08:48:43] hashar: I am hoping that it does not repeatedly rewrite the jenkins jobs, I mean, if something breaks [08:48:52] we'll probably have to edit the job manually [08:48:57] and then update the YAML spec [08:48:59] right ? [08:49:20] I am not sure I understand the concern [08:49:38] hashar: ok let me put it this way [08:49:40] if something breaks while generating the job, we have to fix the YAML spec and rerun the script [08:50:11] hashar: is anyone allowed to rerun the script for their own REPO ? [08:50:19] ahh [08:50:23] :) [08:50:37] it depends on the workflow used to trigger jobs [08:50:40] that is done in Zuul [08:50:55] on core, the lint checks can be retriggered by entering in Gerrit comment box "recheck" [08:51:02] that is detected as a keyword by Zuul [08:51:19] the unit tests are triggered anytime a comment is sent with Code-Review: +2 [08:52:27] hm, alright [08:52:39] what I wanted to know is just the following [08:53:02] let's say something breaks. can I fix my own yaml ? or will I have to talk to someone to fix/commit/something for me ? [08:53:58] :) [08:54:07] ah [08:54:22] ou will have to edit the yaml file and send your patch to Gerrit [08:54:30] have it reviewed / merged [08:54:45] you can also run the script from your laptop too [08:54:51] does that not add delays to fixing a problem ? can it be bypassed for urgent fixes ? [08:55:13] I will make it to automatically regenerate job whenever a change is merged [08:55:37] does regenerating a job imply re-pulling the whooole git repo ? [08:55:43] and as for urgency, I believe that whenever someone will make a change, he will also be able to fix it right away [08:55:56] no that is only regenerating the jenkins job configuration [08:56:04] the working space is still there unless the job name change [08:56:14] and if it changes, you mv the dir ? [08:56:19] by dir I mean the workspace [08:57:08] well we could move the dir manually [08:57:17] anyway, refetching the git repo is not much of an issue [08:57:25] most of our repos are really small [08:57:44] but won't that leave many zombie unused dirs around and eventually fill up the disk? [08:58:15] we have like half a TByte :) [08:58:21] and we don't rename jobs that often anyway [08:58:25] :D [08:58:27] you are right [08:58:30] I was just nitpicking :D [08:58:42] that is a good objection though :-] [08:59:13] yes, but it's kinda "micro-optimization" in a sense :) [08:59:38] so , I wanted to ask if you could help me with adding some Perl modules to jenkins please ? [08:59:58] it's not complicated, but it requires some stuff [09:00:32] hashar: are you familiar with Ruby ? I guess you are right.. [09:00:38] since you're playing with puppet :) [09:00:44] hashar: do you know rbenv ? [09:00:49] well puppet is a DSL on top of ruby [09:00:53] yes [09:00:56] i know a tiny bit of ruby but not that much [09:00:59] hashar: so you like ruby and know rbenv right ? [09:01:01] I am huge fan of rake though :-] [09:01:07] and never heard of rbenv [09:01:18] hashar: ok [09:01:29] hashar: I want to make an analogy so I'm now searching for the bridge [09:01:48] hashar: uhm, have you played with pip in Python ? [09:02:27] a bit :-] [09:02:30] great !!! 
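Earlier in this exchange hashar explains how Zuul maps Gerrit events to Jenkins jobs: a bare "recheck" comment re-runs the lint checks, and a Code-Review +2 vote triggers the unit tests. A conceptual sketch of that matching follows; it is not Zuul's real configuration format or API, just an illustration of the idea.

```python
# Match simplified Gerrit "comment-added" stream events against trigger rules
# and return the Jenkins jobs that should run. Pipeline and job names are
# placeholders.
import json
import re

PIPELINES = [
    {"name": "check",
     "comment_re": re.compile(r"^\s*recheck\s*$", re.MULTILINE),
     "jobs": ["mediawiki-core-lint"]},
    {"name": "test",
     "approval": ("Code-Review", 2),
     "jobs": ["mediawiki-core-phpunit"]},
]

def jobs_for_event(event):
    """Return the Jenkins jobs to trigger for one Gerrit comment-added event."""
    if event.get("type") != "comment-added":
        return []
    jobs = []
    for pipeline in PIPELINES:
        comment_re = pipeline.get("comment_re")
        if comment_re and comment_re.search(event.get("comment", "")):
            jobs.extend(pipeline["jobs"])
        approval = pipeline.get("approval")
        if approval and any(a.get("type") == approval[0] and
                            int(a.get("value", 0)) >= approval[1]
                            for a in event.get("approvals", [])):
            jobs.extend(pipeline["jobs"])
    return jobs

# Example event, simplified from what Gerrit's stream-events would deliver:
event = json.loads('{"type": "comment-added", "comment": "recheck", "approvals": []}')
print(jobs_for_event(event))   # -> ['mediawiki-core-lint']
```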
[09:02:38] we don't allow it in production though [09:02:39] we have something in Perl that is the equivalent of pip [09:02:45] in Perl-land it's called local::lib [09:03:10] in some of the tests I'm writing for wikistats I need some modules from CPAN [09:03:22] the only big problem is ... they are not on jenkins [09:03:39] can't you ship the modules in the wikistats repo ? [09:03:50] hashar: no beacause there's a huge dep-chain [09:03:53] something like perl -Ilib [09:04:07] huge dep-chain, would have done that if it was possible, I know what you mean [09:04:16] and what are the modules you want ? They most probably are in Ubuntu Precise already [09:04:26] like Test::more or something? [09:04:30] moment [09:06:22] HTML::TreeBuilder::XPath , HTML::TreeBuilder , Web::Scraper [09:06:25] these are the modules [09:06:27] hashar: ^^ [09:08:36] libhtml-treebuilder-xpath-perl , libweb-scraper-perl [09:08:44] these are two packages I've found which provide the needed modules [09:08:59] hashar: but the problem is actually that these packages are probably old and they don't contain the latest fixes from CPAN [09:09:25] hashar: I have a fix in mind which would be the following. Let's say I have a file in my repo called required_cpan_module [09:09:44] sorry was fixing another change [09:10:23] hashar: every time I git review or every time I git push , would it be possible that Zuul or someone read this required_cpan_modules file and just run cpanm for every line in there (cpanm is the equivalent of pip in a way) ? [09:10:36] so it would install every module described in required_cpan_module [09:10:56] in other words like a Chef cookbook ! I think that's what it's called rihgt ? [09:11:04] we can not do that sorry [09:11:06] it just says the packages and and chef installs them [09:11:16] hashar: ok [09:11:19] that would require the script to fetch code from an untrusted repository [09:11:25] that will run as root on the server :-] [09:11:30] hashar: ok you are right [09:11:32] or jenkins [09:11:34] or whatever :-] [09:11:42] hashar: can we go for the packages then ? [09:11:52] we should install the Ubuntu packages. [09:12:04] if a newer version of the package is required, we could have it packaged for Ubuntu [09:12:19] I think there are scripts to automatically convert a CPAN module to a deb package :-D [09:14:10] 11:08 < average_drifter> libhtml-treebuilder-xpath-perl , libweb-scraper-perl [09:14:15] they are already in Ubuntu as packages [09:14:26] nice [09:14:35] so we need to update the puppet manifest :-] [09:14:50] the puppet manifest ? [09:14:59] oh ! like we did last time right ? 
[09:15:02] with them other packages [09:15:05] yeah [09:15:13] so basically whenever you want something installed on a production box [09:15:17] you HAVE TO use puppet :-] [09:15:22] (cause none of us are root hehe) [09:15:33] then it is validated by peers and or a root user [09:15:52] sounds good [09:16:04] moderately good :) [09:16:14] I will find the operations repo to see where I can fit those [09:16:15] you will have to add the packages to manifests/misc/contint.pp [09:16:40] then look for the $CI_DEV_packages array which contains misc packages, you can yours there :-] [09:16:41] and [09:16:52] I can help you writing the update :-D [09:17:33] if your editor does not have syntax highlighting for puppet manifests, you can fallback to ruby syntax highlight, that should be good enough [09:20:20] average_drifter: also if you want to verify your new puppet manifest is fine, you can install the puppet ruby gem on your machine [09:20:32] and then run : puppet parser validate manifests/misc/contint.pp [09:20:42] I will try that [09:20:43] (or send it to Gerrit and Jenkins will do that for you) [09:20:46] gem install [09:20:51] puppet [09:20:55] ok [09:20:56] I think [09:21:21] yeah that is what `gem list --local` gives me [09:21:27] can I have two package{} in one single class ? [09:22:27] package{} declares a Package resource in puppet [09:22:36] and yeah you can multiple of them [09:22:39] cool ! [09:22:47] in this specific case you could just amend the array $CI_DEV_packages [09:22:58] which is then used to package { $CI_DEV_packages: ensure => present; } [09:23:10] (aka make sure all packages in the CI_DEV_packages array are installed on the server). [09:23:12] so that is an easy change [09:23:22] may I use the misc::contint::analytics::packages class please ? [09:23:25] it is named after our team [09:23:39] and it offers explicit information about who uses the packages I will add [09:26:10] sure :) [09:26:23] I did not know about that class [09:26:32] I really need to switch that huge manifests to a puppet module :-] [09:30:30] ok I added them [09:31:22] send to gerrit and paste me the gerrit change URL :-]  I will have a look at it [09:31:26] coffee break meanwhile [09:32:05] ok [09:32:33] https://gerrit.wikimedia.org/r/37800 [09:35:21] now we need ops to merge the change and have it applied on gallium (the cont int server) [09:35:46] usually ping in #wikimedia-operations [09:51:08] hashar: there is one more thing [09:51:16] hashar: we need a bridge between jenkins and stat1 [09:51:40] hashar: so that after each new changeset it runs a few days a data through wikistats [09:51:51] hashar: so we get to see some reports in the browser for that particular changeset [09:51:55] so what I mean is [09:53:52] git review ===> change gets to gerrit ===> gerrit tells jenkins to run it ===> jenkins runs stuff, and at some point talks to stat1 and it tells him "hey gimme some data for 2012-10-01 so I can run it for you" ===> jenkins runs reports for 1 full day of data ===> data is exposed through a web server so we can see it in our browsers [09:54:40] cant you ship a small test of data ? [09:54:47] agrmbm [09:55:13] can't you add in your repository a small set of data that would be used to unit test your script ? 
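The small, shippable test data hashar asks for here is what average_drifter's generator module (Generate::Squid, mentioned at the start of the log) provides in Perl. Below is a rough Python sketch of the same idea; the field order and values only approximate the squid/udp2log line format, the point being that tests can fabricate a small, deterministic data set instead of pulling real traffic from stat1.

```python
# Generate fake squid-style log lines for unit tests. Hostname, field layout
# and values are illustrative approximations, not the exact production format.
import random
from datetime import datetime, timedelta

FIELDS = ("{host} {seq} {ts} {reqtime} {ip} {status} {size} "
          "{method} {url} NONE:- {mime} {referer} - {agent}")

def squid_line(seq, when, url):
    return FIELDS.format(
        host="sq18.wikimedia.org",                   # made-up squid hostname
        seq=seq,
        ts=when.strftime("%Y-%m-%dT%H:%M:%S.000"),
        reqtime=random.randint(1, 500),
        ip="127.0.0.%d" % random.randint(1, 254),    # anonymised test IPs
        status="TCP_MEM_HIT/200",
        size=random.randint(500, 50000),
        method="GET",
        url=url,
        mime="text/html",
        referer="-",
        agent="TestAgent/1.0",
    )

def generate(n=1000, start=None):
    """Yield n fake log lines, one second apart, cycling over a few URLs."""
    start = start or datetime(2012, 10, 1)
    urls = ["http://en.wikipedia.org/wiki/Main_Page",
            "http://blog.wikimedia.org/2012/10/01/some-post/",
            "http://de.wikipedia.org/wiki/Berlin"]
    for i in range(n):
        yield squid_line(i, start + timedelta(seconds=i), urls[i % len(urls)])

if __name__ == "__main__":
    for line in generate(5):
        print(line)
```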
[09:55:25] hashar: yes, that is possible, but we should be able to get a clear view of what the gerrit change has caused [09:55:51] hashar: I actually wrote a module who generates data and I'll send that soon to gerrit [09:56:12] i think that is a better approach than grabbing a set from stat1 :-D [09:56:38] hashar: yes but grabbing a set from stat1 or somehow sending the new code to be run on stat1 has some advantages [09:56:46] which is... we can easily compare stuff [09:56:54] what percentages have risen which have fallen [09:56:57] etc [09:57:00] in the reports [09:57:16] real data is more real than tests :D [14:19:47] morning average_drifter [14:25:29] morning ottomata [14:26:10] morning! [14:34:14] good morning! [14:42:16] moorning! [14:46:15] hello drdee [14:46:19] what's goin on with gerrit ? [14:46:24] why can't I git-review ? [14:46:40] error: The requested URL returned error: 403 while accessing https://gerrit.wikimedia.org/r/p/analytics/wikistats.git/info/refs [14:59:55] drdee: you closed 36722 ? [15:16:57] average_drifter: club mate? [15:19:39] $ TZ=Europe/Bucharest date -R [15:19:40] Mon, 10 Dec 2012 17:19:30 +0200 [15:19:59] average_drifter: where do you get this 9 -> 9:30 thing???! [15:19:59] jeremyb: ? [15:20:09] jeremyb: Google told me [15:20:12] jeremyb: why ? [15:20:14] told you what? [15:20:30] Goold Calendar told me at 9PM [15:21:29] http://i.imgur.com/MJKzf.png [15:22:10] jeremyb: ^^ [15:22:29] oh, you mean 30 mins duration? [15:22:40] i thought you were saying 9pm was equivalent to 9:30 in some other place [15:25:10] $ for TZ in Europe/{Bucharest,Athens}; do zdump -v "$TZ" | egrep -e ' 201[23] ' | cut -d' ' -f 2- | sha1sum; done | uniq -c 2 571cbb5e389fdbe47acda9ca3c7f650722e6cec6 - [15:26:29] so, average_drifter and apergos and paravoid all have the same home TZ for at least the next year. and that's the same as aharoni and nike and ori-l most of the year too [15:27:02] unless maybe nike's not there any more [15:27:13] average_drifter: have you tried club mate? [15:47:24] I know paravoid or apergos are from GR [15:47:29] or Greece :) [15:47:34] I'm from Romania [15:47:42] and uhm, not sure about ori-l [15:47:49] we probably are on the same timezone [15:49:46] average_drifter: i proved it already ;) (see my paste above) [15:49:46] https://gerrit.wikimedia.org/r/37834 [15:49:53] jeremyb: :) [15:50:09] you sha1-ed our hours maybe ? [15:50:33] or just timezones I guess [15:50:50] oh well [15:50:54] drdee: https://gerrit.wikimedia.org/r/37834 [15:50:57] drdee: please have a look [15:51:43] hashar: what happens with our gerrit change for the packages after it is approved [15:51:46] ? [15:51:51] does something parse it and run it for gallium ? [15:51:56] someone/something/ [15:52:01] I can't answer sorry, about to leave. [15:52:13] please mail me to hashar @ free . fr and will reply tonight [15:53:44] ok [15:59:02] average_drifter: http://dpaste.com/843578/plain/ [16:53:21] jeremyb: I tried club mate yes, I was in Berlin for NYE last year [16:53:30] jeremyb: they had lots of it in there in about every store [16:53:58] jeremyb: but I have a feeling the actual mate is different [16:54:39] I can't fully recall/assess if club mate had any visible effect [16:54:45] in terms of awakeness :) [16:59:35] jeremyb: did you notice any ?\ [17:22:59] average_drifter: i don't think i've tried either. 
but seen a lot of both [17:23:43] average_drifter: also in berlin and CCC camp and hackerspace in vienna and I think in DebConf1[01] [17:44:48] average_drifter: did we already merge the other changeset? [17:50:51] average_drifter: shall I merge https://gerrit.wikimedia.org/r/#/c/37607/ ? [17:51:42] ping milimetri [17:51:45] ping milimetric [17:51:52] hi [17:53:02] sup> [17:53:32] I'm working on deployment [17:53:54] something's odd but I don't think it's apache yet so I haven't bothered otto [17:54:13] did you mean to ping me about the changeset above drdee? [17:54:30] no :) just wanted to say hi [17:58:59] :) oh hello. ottomata can I restart kripke or is that a big no-no? [17:59:08] that's cool [17:59:18] dschoon might have more opinions about that than me :p [17:59:47] i think it should always be fine to reboot, it's a dev machine [18:00:54] https://plus.google.com/hangouts/_/2e8127ccf7baae1df74153f25553c443bd351e90 [18:02:53] drdee [18:02:55] erosen [18:02:58] ^^ [18:03:45] sorry [18:06:37] drdee: hey [18:06:59] yo [18:07:15] should i merge https://gerrit.wikimedia.org/r/#/c/37607/ ? [18:08:15] drdee: yes please [18:08:24] drdee: then merge this one also please https://gerrit.wikimedia.org/r/#/c/37834/ [18:08:55] merge conflict: https://gerrit.wikimedia.org/r/#/c/37607/ [18:09:01] how can this happen every time? [18:09:21] someone commited in the meantime ? [18:09:42] i dunno [18:10:43] drdee: are there many files conflicting ? [18:10:53] dunno [18:11:04] drdee: can I merge with master first and then I can notify you to try again ? [18:11:11] sure [18:11:13] ok [18:22:25] ottomata, quick question about adding remote gerrit destination for wikistats on stat1, the full command is [18:23:21] git remote add origin ssh://ezachte@gerrit.wikimedia.org:29418/analytics/wikistats.git [18:23:30] (for erik zachte) [18:36:06] ottomata ^^ [18:39:45] um, 'origin' would be the name of the remote [18:39:49] and, i'm not sure if you want to change that [18:39:59] since it probably is set to the readonly url [18:40:03] so that would be 'erik' [18:40:04] so you can name that whatever you want [18:40:07] sure [18:40:14] after you do that [18:40:14] this is about allowing erik to push from stat1 [18:40:17] right [18:40:27] so after you add a remote for his url [18:40:32] if it was named 'erik', you could do [18:40:36] git push -u erik [18:40:43] what is -u ? [18:40:46] upstream [18:40:52] that's telling push which upstream remote to use [18:40:53] k [18:40:54] by name [18:48:39] ottomata, is the packet loss issue for blog.wikimedia.org related to the overall packet loss issue? [18:48:47] yes and no [18:48:53] yes, because unsampled causes packet loss [18:49:01] so if you want an unsampled stream to run webstats collector [18:49:09] than you'll get packet loss [18:49:13] well, i'm not sure that unsampled causes it [18:49:16] but that's what happens righ tnow [18:49:32] webstatscollector was always running unsampled, right? [18:50:54] hm, yes it was [18:50:55] interesting [19:23:46] ottomata: we can have a webstatscollector run side by side with a simple filter (a python one-liner filter which checks for blog.wikimedia.org on URL field) [19:24:07] ottomata: we multiplex the stream in two parts => one for webstatscollector and one for the simple filter above ^ [19:24:27] ottomata: after a period of say... 
one hour we compare what webstatscollector has dumped to disk and what the simple filter has found [19:24:58] ottomata: then we can find out if the cause is packet loss [19:25:06] IMHO this would be a way to find out [19:26:19] if there was a way to artificially create an interface (like the loopback) with a given packet loss then we could also test that one(but it'd prolly require root rights) [19:30:53] i don't think webstatscollector or udp-filter -o is causing loss [19:31:04] i get packet loss on an03 where I am running a few unsampled processes [19:31:16] its just running unsampled streams are causing problems [19:31:25] not sure why at the moment [19:31:29] :( [19:31:36] i'm trying to understand how udp2log gets its data to each process [19:31:45] but it is complicated code that is not well documented [19:39:43] ottomata, can you install git review on stat1? [20:01:20] drdee: autom4te.cache was blasted from the git repo :) https://gerrit.wikimedia.org/r/#/c/31858/ [20:01:28] build succesful [20:03:42] mmm maybe also remove aclocal.m4, config.log and config.status [20:26:12] hey guys (ottomata, milimetric, average_drifter) can you please update / close / add your asana tasks? i am about to generate the weekly report [20:27:46] I'm up to date drdee, can't close anything right now [20:27:51] deployment is a bitch [20:27:54] ok [20:52:49] drdee, sorry, meant to tell you that was done [20:52:52] git-review on stat1 [20:52:55] hey ottomata [20:52:58] heya [20:53:02] ty ottomata! [20:53:03] i still can't figure out my problem [20:53:04] hey there :-] [20:53:16] me neither! so many problems [20:53:18] mind taking a look? [20:53:19] :) [20:53:21] but, lemme take a break from mind [20:53:23] what's up? [20:53:32] mind? [20:53:33] mine* [20:53:35] mind=mind [20:53:41] mind=mine [20:53:41] ok, so here [20:53:43] http://test-reportcard.wmflabs.org/dashboards/reportcard [20:53:50] that's serving what I want to serve in DEV [20:54:06] (it's still broken, we can ignore that for now because that seems to be a problem with node and how Dave configured it) [20:54:16] ok [20:54:19] you want it in prod mode? [20:54:26] but http://dev-reportcard.wmflabs.org/dashboards/reportcard is worse, it's not loading the same way [20:54:45] this is confusing to explain, hangout? [20:54:52] um, ok, just got to a cafe [20:55:14] https://plus.google.com/hangouts/_/2e8127ccf7baae1df74153f25553c443bd351e90 [21:39:35] ottomata, any news regarding the packet loss issue? [21:39:56] nope, been helping milimetric with his thing [21:40:00] i have no idea at the moment [21:40:05] ty! [21:40:06] just got off call with dan [21:40:12] which was good [21:40:14] aight [21:40:18] my eyes were going boogly from reading tim's C++ [21:40:24] :D [21:40:25] i think it is very good code, just not well documented [21:40:27] I can read C no problem [21:40:31] C++ I am very rusty on [21:40:46] send him an email [21:40:47] ? [21:40:50] maybe [21:41:00] i'm not confused enough about the code yet [21:41:06] i was able to get the general gist of how it works [21:41:09] if not follow it exactly [21:41:11] k [21:41:17] which is what I needed [21:41:30] I wanted to determine if each process was reading from the socket or network buffer somehow [21:41:30] it isn't [22:22:22] hey average_drifter, drdee [22:22:24] i just noticed something [22:22:45] i'm only getting dropped packets on an26 [22:22:54] when I run unsampled udp-filter -o [22:24:14] so udp-filter is causing it ? 
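The "simple filter" proposed earlier for the blog comparison (a python one-liner that keeps only blog.wikimedia.org hits from the udp2log stream) could be as small as the sketch below; the URL field position is an assumption about the log format and would need adjusting to the real column.

```python
# Keep only udp2log lines whose URL field points at blog.wikimedia.org.
import sys

URL_FIELD = 8          # assumed 0-based position of the URL in a udp2log line

for line in sys.stdin:
    fields = line.split()
    if len(fields) > URL_FIELD and "blog.wikimedia.org" in fields[URL_FIELD]:
        sys.stdout.write(line)
```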
[22:24:18] i tink so [22:24:22] udp-filter -o [22:25:01] ottomata: can I get an input and expected output please ? [22:26:49] ? [22:27:16] of what? [22:28:05] ok [22:28:07] and [22:28:19] on locke, we're using domas' 'filter' code for this, right? [22:28:32] i'm trying filter on an26 instead of udp-filter -o [22:28:37] and it does not cause dropped packets [22:29:37] hmm [22:30:02] so, what were the changes needed that caused us to use udp-filter instead of just modifying 'filter'? [22:30:10] can't we just use filter and have it accept blog urls? [22:31:04] ottomata: well we discussed that filter and udp-filter are doing similar things so we decided to merge them [22:31:20] ottomata: are you sure udp-filter is dropping packets or just filtering out some lines ? [22:31:48] ottomata: ok, please provide me with an unsampled log file so I can test and see what's going on [22:31:59] ottomata: I'll compare filter and udp-filter -o output to see how they differ [22:32:35] ok [22:34:19] average_drifter [22:34:22] unsampled log sample [22:34:27] on stat1:/a/otto/unsampled.log [22:40:07] ok [22:41:08] ottomata: is the old filter present anywhere on stat1 ? [22:41:47] now it is :) [22:41:53] /a/otto/filter [22:42:25] did you actually make logical changes to filter stuff with udp-filter -o? [22:42:29] or just port it over to udp-filter? [22:43:32] ottomata: for just -o , udp-filter should output the same amount of lines as filter [22:43:49] drdee: this is true, right ? ^^ [22:43:57] so there are no functional differences between filter and udp-filter -o [22:43:58] ? [22:44:22] ottomata: there should be none, I'll have to refresh my memory on it though, I'll look to see what's going on [22:44:28] thanks for the dump and the filter binary [22:44:29] ok cool [22:44:33] um, i'd like to recommend [22:44:37] yes ? [22:44:37] if there are no functional differences [22:44:41] then let's not use udp-filter [22:44:52] there's no reason to port it over if we're not improving it in some way [22:45:08] especilaly since it is kind of a completely different feature set than what udp-filter was built for [22:45:43] its just more to debug and troubleshoot and worry about [22:45:49] filter is already running happily on locke [22:45:56] there's no reason to stop using it if it is working [22:46:10] well there are some advantages of having filter functionality in udp-filter . udp-filter has many more switches and it can be used to perform more complex filtering than just filter itself [22:47:05] but you can't use those switches in combination with -o [22:47:06] right? [22:47:20] udp-filter -o == filter [22:47:27] ? [22:47:48] I think you can [22:48:09] don't see how, the field positions are hardcoded [22:48:09] some of them can be used in conjunction with -o [22:48:17] -o changes the output format [22:48:21] what SHOULD happen [22:48:23] is [22:48:30] udp-filter -whatever -flags -you -want | filter [22:48:39] unix pipe method [22:48:45] wouldn't this add complexity and lead to performance loss ? [22:48:47] keep functionally separate programs separate [22:49:15] piping? don't think so, unix is kinda built this way [22:50:15] maybe, but, the fact that udp-filter -o causes dropped packets is not good [22:50:23] yes but that can be fixed [22:50:27] but why should we? 
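A small harness for the comparison being set up here: run the old filter and udp-filter -o over the same sample and report where their output first diverges. The binary and sample paths are the ones mentioned in the chat (/a/otto/filter, /a/otto/unsampled.log); adjust them to wherever the files actually live.

```python
# Compare the output of the two filter implementations on the same sample file.
import itertools
import subprocess

SAMPLE = "/a/otto/unsampled.log"
OLD_FILTER = ["/a/otto/filter"]
NEW_FILTER = ["udp-filter", "-o"]

def run(cmd):
    """Feed the sample through one filter and return its output lines."""
    with open(SAMPLE, "rb") as sample:
        out = subprocess.run(cmd, stdin=sample, stdout=subprocess.PIPE,
                             check=True).stdout
    return out.decode("utf-8", "replace").splitlines()

old_lines = run(OLD_FILTER)
new_lines = run(NEW_FILTER)
print("filter:        %d lines" % len(old_lines))
print("udp-filter -o: %d lines" % len(new_lines))

for i, (a, b) in enumerate(itertools.zip_longest(old_lines, new_lines)):
    if a != b:
        print("first difference at output line %d:" % (i + 1))
        print("  filter:        %r" % a)
        print("  udp-filter -o: %r" % b)
        break
```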
[22:50:32] why do we want to touch a working system [22:50:39] we shouldn't mess with stuff unless there is a good reason to [22:50:50] i don't see a good reason to right now, other than we feel like it would be more elegant [22:50:59] (I don't think it would, but I can see the argument for it) [22:51:20] testing new code is one of the reasons we haven't deployed this yet [22:51:30] if we don't have to modify filter, than that is a huge piece that we don't have to test before we deploy [22:55:18] ottomata, but we had to fix filter anyways, like adding bot filtering [22:55:38] ah ok, so there is different functionalility [22:55:42] well, whatever changes have been made [22:55:47] are causing dropped packets :p [22:55:55] i'd still say change as little as possible [22:55:58] crap [22:56:03] k [22:56:13] I got everything I need to fix it [22:56:24] average_drifter, i'm not sure how you do [22:56:30] average_drifter: so what is an easy way to improve performance is the if sequence for domains [22:56:34] you won't be able to detect drop packets unless you can run this on a live stream [22:56:41] inside of udp2log [22:56:41] make sure that wikipedia is the first clause [22:58:20] spetrea@stat1:~/webstats_debugging$ export _LINES=20000; cat unsampled.log | head -$_LINES | ./usr/bin/udp-filter-static -o | wc -l; cat unsampled.log | head -$_LINES | ~/filter | wc -l; [22:58:24] 4190 [22:58:26] 2902 [22:58:33] first is udp-filter-static output, second is filter output [22:59:08] right, ok so there are differences that you should be able to resolve [22:59:11] but, what i'm saying [22:59:27] is that this isn't just a problem with the output being different [22:59:32] this is a problem of performance profile being different [22:59:58] and that is a much trickier thing to resolve [23:00:04] that's why i'm recommending you stick with filter.c [23:00:12] and make as few changes as possible [23:00:21] but the thing is we need to add new domain support [23:00:21] a [23:00:23] nd [23:00:23] t [23:00:30] domas and Tim whoever else have already dealt with the performance issues [23:00:38] so by recoding it in udp-filter, you are going to have to redo their work [23:00:42] and their work is not well documented [23:01:08] wait [23:01:13] I was a bit sure this would happen [23:01:22] so by geocoding them on the fly we are provoking packet loss [23:01:22] right ? [23:01:32] you are geocoding them? [23:01:33] probably [23:01:35] did filter do that? [23:01:38] no [23:01:48] but we shouldn't geocode on the unsampled stuff [23:02:14] does -o geocode? [23:02:19] no [23:02:20] -g -b country does [23:03:14] ooook, so then that is not our problem [23:03:46] average_drifter; can you port the new domain support to filter.c ? [23:03:50] yes [23:03:51] udp-filter -o [23:03:58] that causes dropped packets on unsampled stream [23:04:15] why average_drifter? [23:04:27] are you doing a lot of string copying ? 
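Since the discussion turns to profiling, a quick offline throughput check is one way to see whether udp-filter -o is simply slower than filter and therefore falls behind the unsampled stream. This only measures offline throughput, not the live udp2log pipeline, and reuses the same path assumptions as the comparison sketch above.

```python
# Time each filter binary over the same sample and report lines per second.
import subprocess
import time

SAMPLE = "/a/otto/unsampled.log"
CANDIDATES = {
    "filter": ["/a/otto/filter"],
    "udp-filter -o": ["udp-filter", "-o"],
}

with open(SAMPLE, "rb") as f:
    total_lines = sum(1 for _ in f)

for name, cmd in CANDIDATES.items():
    with open(SAMPLE, "rb") as sample:
        start = time.time()
        subprocess.run(cmd, stdin=sample, stdout=subprocess.DEVNULL, check=True)
        elapsed = max(time.time() - start, 1e-6)
    print("%-15s %8.2f s  (%.0f lines/s)" % (name, elapsed, total_lines / elapsed))
```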
[23:04:32] not udp-filter -g -b ( not saying it doesn't, just that I haven't tested it) [23:04:38] we are just testing udp-filter -o [23:04:42] on an unsampled stream [23:05:00] soooooo, jajajaja, i'm looking at filter.c now, and its got a bunch of hardcoded stuff [23:05:06] maybe all 3 of us should look at the source ): [23:05:06] why not just make the hardcoded stuff configurable [23:05:10] and not change any functionality [23:05:26] drdee: there is string copying going on, yes [23:06:31] from a brief read of filter.c [23:06:52] it looks like all it does is tokenize the line, and if it has required fields, prints a line [23:07:34] well a bit more, it does filter on domains [23:07:48] right, i guess that counts as a require field [23:07:49] or whatever [23:07:50] but yeah [23:07:50] so [23:07:57] if you make the domains and the dup_ips configurable [23:08:01] then the rest of the code can stay the same [23:08:12] ?? [23:08:21] as in [23:08:22] that was already part of the original filter.c [23:08:30] right, but it is hardcoded [23:08:32] and you want to change it [23:08:37] and you might want to change it in the future [23:08:53] making this configurable is hard and not worth it [23:08:59] allowing the IP dupes and filter domains to be specified at runtime is not going to affect the performance? [23:09:13] ha? what? how is that hard? much less hard than porting to udp-filter [23:09:27] just parse cli flags [23:09:31] with getopt or whatever in C [23:09:35] no [23:09:43] no? [23:09:50] for example adding support for blog.wikimedia.org requires extra business logic [23:09:54] you cannot just add that domain [23:09:54] why? [23:10:10] you don't want to just filter for that domain and then print the path? [23:10:12] what is the title when visiting blog.wikimedia.org ? [23:10:50] everyhting after the / [23:11:00] right [23:11:10] so that would be NULL [23:11:39] we need to figure out what introduced the performance regression [23:11:59] ah i see, it does have specific stuff for /wiki/ [23:11:59] ok [23:12:01] so [23:12:05] here's another question [23:12:06] i am looking at collector-output.c and i think it's conceptually very similar to filter.c [23:12:14] why do we need to use webstatscollector for blog.wikimedia.org? [23:12:35] because we have been asked a million times to provide page views metrics [23:12:48] but why dod we need to do it with webstatscollector? [23:12:52] how else? [23:13:00] brb, thinking about valgrind and profiling next. We will fix this for sure [23:13:08] you want a page view count of each page on blog.wikimedia.org [23:13:09] right? [23:13:15] yup [23:13:25] why does webstatscollector have to do this? 
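The "extra business logic" being argued about above is essentially the URL-to-title mapping: for wiki domains the page title is whatever follows /wiki/, while for blog.wikimedia.org it would be the whole path (and the front page has none). A sketch of that mapping, with no claim about how filter.c or webstatscollector actually encode it:

```python
# Map a request URL to (site, page title), handling both /wiki/ pages and
# blog.wikimedia.org paths. Purely illustrative.
from urllib.parse import urlsplit

def page_title(url):
    """Return (site, title) for a request URL, or None if it is not a page view."""
    parts = urlsplit(url)
    host, path = parts.hostname or "", parts.path
    if host.endswith("wikipedia.org") and path.startswith("/wiki/"):
        return host, path[len("/wiki/"):]
    if host == "blog.wikimedia.org" and path not in ("", "/"):
        return host, path.strip("/")        # e.g. "2012/10/01/some-post"
    return None

print(page_title("http://en.wikipedia.org/wiki/Main_Page"))
print(page_title("http://blog.wikimedia.org/2012/10/01/some-post/"))
print(page_title("http://blog.wikimedia.org/"))   # front page -> None here
```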
[23:13:33] webstatscollector look slike it was built for mediawiki [23:13:37] why try to bend it for wordpress [23:14:08] no, we are not trying to bend it to wordpress [23:14:14] we are just consuming udp2log [23:14:26] you are though, you have to change business logic in webstatscollector to make it work with blog urls [23:14:33] the blog traffic is pretty small [23:14:37] write some python script to do it [23:14:40] i think we should focus on what introduced the performance regression [23:15:05] udp-filter -d blog.wikimedia.org | pageview_counts.py —period=hourly [23:15:28] or perl [23:15:30] or awk and bash [23:15:32] or whatever [23:15:35] or C if you want [23:16:03] we could do that as well but the idea was that the analtyics team would stop introducing adhoc solutions ;) [23:16:13] and bring some sanity to this [23:16:30] if it turns out that the performance regression is hard to fix then let's go with [23:16:35] i think modifying udp-filter to do something completely different than everything else it does is not sane [23:16:36] udp-filter -d blog.wikimedia.org | pageview_counts.py —period=hourly [23:16:59] i think modifying business logic of mediawiki stats code (webstatscollector) is not sane either [23:17:02] i don't think we did that, we just added a different output format as an option [23:17:09] no, its completely different [23:17:31] all of the other flags just filter on field content, or slighly modify fields (geocoding, anonymizing, etc) [23:17:41] they dont' change the format [23:17:42] actually [23:17:42] wait [23:17:46] filter doesn't even count urls [23:17:49] you could do this with awk [23:18:53] we just cleaned up all those awk scripts ;) [23:19:13] yeah, but abstracting common features [23:19:16] this is not a common feature [23:19:19] by* [23:19:46] also, this is a temp bandaid solution anyway too, right? eventually kraken will do this anyway, no? [23:19:49] yikes, i gotta run [23:20:14] yeah,you are right, i will see if we quickly can fix the performance thing and else we will go your route [23:21:28] ok cool, hope it works out, i think its going to be hard to test the perf stuff though [23:21:33] ok, talk to you boys tomorrow [23:21:35] latas [23:27:09] I wish ottomata had given some % packet loss comparison between filter and udp-filter -o [23:27:13] it would've helped a lot to know that [23:28:03] but can manage with what I have also
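The pageview_counts.py proposed above ("udp-filter -d blog.wikimedia.org | pageview_counts.py --period=hourly") could be a short script along these lines; the timestamp and URL field positions are assumptions about the udp2log line format.

```python
#!/usr/bin/env python
# Count page views per hour per URL from a pre-filtered udp2log stream on stdin.
import sys
from collections import Counter

TS_FIELD = 2      # assumed 0-based position of the ISO timestamp
URL_FIELD = 8     # assumed 0-based position of the request URL

counts = Counter()
for line in sys.stdin:
    fields = line.split()
    if len(fields) <= URL_FIELD:
        continue                      # skip malformed lines
    hour = fields[TS_FIELD][:13]      # "2012-12-10T23..." -> hourly bucket
    counts[(hour, fields[URL_FIELD])] += 1

for (hour, url), n in sorted(counts.items()):
    print("%s\t%d\t%s" % (hour, n, url))
```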