[03:40:41] http://www.thevenusproject.com/online-community
[13:18:21] @labs-project-admins cvn
[13:18:24] @labs-project-users cvn
[13:18:24] Following users are in this project (showing all 6 members): Azariv, Krinkle, Novaadmin, Sactage, Matanya, Seahorseruler,
[14:06:51] how do i roll back a commit in git?
[14:07:26] reverting and branching seems like to much, any better metho
[14:07:29] d
[14:09:04] matanya: I'm no expert, so I don't know of other methods.
[14:09:42] thanks
[14:14:35] matanya: something that has already been merged? in that case, using the 'revert' button in Gerrit is probably the easiest solution.
[14:14:48] that's the same as calling git revert, afaik
[14:15:08] valhallasw: no, not merged yet
[14:18:04] matanya: ok, so you want to amend the earlier commit?
[14:18:41] if it's the most recent, git add , git commit --amend
[14:19:04] if it's older, you need to use rebase: git rebase --interactive HEAD~
[14:19:08] then follow the instructions you get
[14:24:49] thanks valhallasw
[18:35:16] Ryan_Lane, YuviPanda, are you both available to talk about proxy packaging?
[18:35:30] in a little bit
[18:35:30] i am!
[18:35:31] in a meeting
[18:35:35] ok
[18:35:46] YuviPanda, can you stay awake until Ryan's free?
[18:35:54] i'll be up for 2 more hours
[18:35:54] so
[18:35:58] i think that's a yeah
[18:36:15] cool
[18:42:51] Coren: fyi, https://bugzilla.wikimedia.org/show_bug.cgi?id=48894 will put ~3T of data on the public datasets on labs
[18:43:13] hopefully that's not too much
[18:43:32] YuviPanda: It's not _too much_ but I want to make a partition for it and share it readonly.
[18:43:52] Coren: how much work would that be?
[18:43:59] Coren: also if it is readonly, how can the rsync work?
[18:44:04] will root still be able to write toit?
[18:44:13] And put it in /public
[18:44:29] Not a lot. ~30m of time.
[18:44:45] You're planning on having it updated /from/ the project?
[18:44:58] Otherwise, I can give rw to a specific project/box that can do the rsync.
[18:45:18] Coren: oh, *share* it readonly
[18:45:19] nevermind
[18:45:26] Coren: makes sense.
[18:45:37] mount it as httpfs read-only
[18:45:39] boom.
[18:45:53] (Even if nobody has write permission, NFS is more efficient when it knows something is readonly)
[18:46:02] Coren: yeah, makes sense
[18:46:23] Coren: /public/datasets/pagecounts?
[18:46:33] Lemme go make a lv for this now.
[18:46:41] Coren: sweeet!
[18:47:22] YuviPanda: Do you have an estimate on how many files comprise that data set? Is it a lot of small files or a couple really big ones?
[18:47:30] Coren: looking
[18:48:14] Coren: one per day
[18:48:30] Coren: ~100M each
[18:48:57] pagecounts is one per hour, no?
[18:49:07] oh wait
[18:49:46] johang: you're right, i cna't read again
[18:49:58] Coren: lots of ~100M files
[18:53:56] Oh, hah. I was confused by the filesystem numbers I saw until I noticed I asked for a 5G filesystem. Oops. :-)
[18:55:19] Coren: heh, sounds a bit small :P
[18:57:51] YuviPanda: What needs write access?
[18:58:06] Coren: moment, looking.
[18:58:18] Coren: the current rsync runs as root
[18:58:26] Yeah, but from where?
[18:58:31] finding
[18:59:46] * Coren put 5T aside for this, to leave a bit of elbow room.
[19:17:59] Coren: out of curiosity I looked at when the Rules page was created and looked for announcements of it, and couldn't find any to labs-l
[19:18:51] Coren: while perhaps mostly an instantiation of how Tool Labs is supposed to be used, I would've liked to see it announced, both as a note that some rules are in place _and_ as a reminder of how it's supposed to be used
[19:21:59] Coren: so what do I mount the new partition from?
[19:22:07] current one is mounted as device => 'labstore1.pmtpa.wmnet:/publicdata-project',
[19:22:10] s/as/from/
[19:22:19] as... glusterfs?!
[19:28:26] Nettrom, it just superseeded the numerous drafts that were badly indexed and peppered around wikitech and mediawiki.org.
:-)
[19:28:49] It should be labnfs.pmtpa.wmnet:/pagecounts
[19:28:59] YuviPanda: ^^
[19:29:06] Coren: okay
[19:29:30] Coren: can you tell me if labstore1.pmtpa.wmnet:/publicdata-project is glusterfs or ntfs?
[19:29:32] gah
[19:29:33] nfs
[19:29:54] It's NFS
[19:30:47] Coren: hmm, if you look at misc::download-gluster, it mounts /publicdata-project as glusterfs, and it seems to be still working
[19:35:42] The gluster server exports every gluster volume also as NFS3
[19:35:53] Coren: also, any particular NFS mount option reccomendations? bg,rsize=8192,wsize=8192,timeo=14,intr?
[19:35:58] ah
[19:35:58] ok
[19:36:19] port=0 required
[19:36:49] bg,rsize=8192,wsize=9182,timeo=14,intr,port=0?
[19:37:09] have only mounted something as NFS before once in my lifetime, and it wasn't a server :P
[19:37:22] You almost certainly want hard too, I'm not sure if that or soft is the default.
[19:37:41] bg,rsize=8192,wsize=9182,timeo=14,intr,port=0,hard
[19:37:44] anything more?
[19:37:48] 8192, not 9182
[19:38:08] andrewbogott, YuviPanda: ok. I'm out of the meeting
[19:38:13] and have lunch and such
[19:38:26] going to have? or had?
[19:38:29] Ryan_Lane, lunch w/a keyboard?
[19:40:19] YuviPanda: I know I keep asking you this, but… the actual proxy is already packaged and puppetized, right?
[19:40:35] andrewbogott: yes
[19:40:38] yep
[19:40:54] so, it should be relatively easy to package this
[19:41:41] just need to install the code somewhere and make an init script
[19:41:47] so
[19:41:53] I looked at the code a bit this morning… I was stopped by the flask-sqlalchemy plugin
[19:42:01] it uses flask, flask-sqlalchemy, sqlalchemy, python-redis, and the debs for all of those are out of date
[19:42:05] heh
[19:42:09] wonderful ;)
[19:42:09] flask-sqlalchemy doesn't exist, afaik
[19:42:18] YuviPanda: how much of an issue is 'out of date'?
[19:42:22] so i was thinking why not just bundle all the dependencies?
[19:42:32] andrewbogott: I don't know, haven't tried running it with old ones yet :D
[19:42:39] YuviPanda: that's very "not debian"
[19:42:50] OK… do you mind running some tests to figure out which packages we actually are missing?
[19:43:03] If it just comes down to flask-sqlalchemy then we can just package that separately.
[19:43:03] I probably can't do that today
[19:43:10] * andrewbogott nods
[19:43:22] Ryan_Lane: is that the only reason?
[19:43:24] Will failures be obvious? Is it just a question of running vs. not running?
[19:43:37] I wonder if any of these are available in the openstack repos
[19:43:45] andrewbogott: well, with flask and sqlalchemy, possibly not.
[19:44:08] andrewbogott: the versions on the repos for flask is 0.8, and current is past 1.0, with multiple security and API fixes.
[19:44:15] python-redis that comes with ubuntu is too old?
[19:44:30] Ryan_Lane: LTS one doesn't implement StrictRedis, just Redis, which has inconsistencies
[19:44:39] Ryan_Lane: is there an index for the openstack repo someplace? I'm googling, don't immediately see it
[19:44:47] andrewbogott: ubuntu cloud archive
[19:45:10] I think we're going to target havana for the move to eqiad
[19:45:13] so going with that is safe
[19:45:17] Ah, right, 'the cloud'
[19:45:22] :D
[19:45:49] DEAR GOD, DOWNLOAD.PP USES TABS!!!!
[19:46:19] YuviPanda: you realize our repo is totally fucked style wise, right? :)
[19:46:30] yeah, but first time I'm encountering one
[19:46:32] I think lots of old puppet code uses tabs
[19:46:53] YuviPanda: if you're looking at download.pp, note https://gerrit.wikimedia.org/r/#/c/90760/
[19:47:34] anyway… action plan:
[19:47:44] I will figure out what versions of which packages are available in the cloud archive.
[19:48:20] if they aren't using flask, we should consider switching to whatever is standard for openstack
[19:48:21] Yuvi (if I haven't emailed before morning) will try reinstalling with stock versions of the dependency packages and figure out which ones we can't live with
[19:48:43] Ryan_Lane: Yeah, ultimately this should all be written using Oslo but… trying to be expedient for the moment :)
[19:48:47] yep
[19:49:00] anyway, once we know what we need I can do the debianizing bits.
[19:49:06] YuviPanda, sound OK?
[19:49:08] operating on the db replicas what is the right table to find pages using a specific template?
[19:49:10] 'tis a bit frustrating, y'know
[19:49:15] YuviPanda: ?
[19:49:23] i.e. what links here, in the dbs?
[19:49:27] well, it works now. why not bundle the dependencies?
[19:49:46] because we don't want to be murdered by our colleagues
[19:49:56] YuviPanda: I understand why you're frustrated, but on the other hand, it's pretty normal to have to consider the platform that your software is targeting.
[19:49:58] the entire parsoid team is dead? :P
[19:50:06] they don't bundle anything
[19:50:31] well, they use a lot of things from npm that don't need to be debianized
[19:50:41] YuviPanda: hey!
[19:50:41] if there's a node exception, I'll happily rewrite this in node to not have to go through the versioning dance
[19:50:54] if this was being used in production we could use git-deploy
[19:51:10] but realistically we tend to require dependencies be packaged
[19:51:38] Even if we used git-deploy we wouldn't want to host and deploy the whole dependency cascade would we?
[19:51:42] and if we're being good openstack citizens (and we should be), then it should definitely be the case ;)
[19:51:45] They seem like unrelated issues
[19:51:45] andrewbogott: indeed
[19:52:38] hey gwicke. how do you get your npm dependencies on the cluster?
[19:52:47] Anyway… I'm happy to do whatever recoding/packaging is needed to get this into an installable deb.
I just need to know what's needed.
[19:53:06] andrewbogott: well, requirements.txt on the repo lists the current set of modules that run
[19:54:40] andrewbogott: if you want me to downgrade them to current available ones on LTS ubuntu, and see what breaks, I can probably do that, but no idea when
[19:54:54] Yes please, if I don't get to it first.
[19:55:02] I just don't know that I'll be good at noticing breakage.
[19:55:15] Hm, cloud archive docs say to run 'sudo add-apt-repository cloud-archive:havana' which is not a thing I've ever done or heard of
[19:55:17] and doesn't work.
[19:55:21] I wish it would just tell me the damn url
[19:55:25] does it matter if it breaks as long as it runs on older libraries? :P
[19:55:36] andrewbogott: we have puppet config to add the cloud archive
[19:55:40] see openstack.pp
[19:55:49] YuviPanda: of course it does ;)
[19:55:52] andrewbogott: honestly, I think debian packages for libraries like flask which move really fast is not a good idea
[19:56:01] and is not something that I like doing
[19:56:05] I disagree
[19:56:09] Them moving really fast is exactly /why/ we want debs
[19:56:10] I see no benefit to that, for *web* apps.
[19:56:19] openstack ships its wsgi packages
[19:56:22] completely agree for system services as such.
[19:56:31] so that upstream maintains backwards compat
[19:56:36] and gives security updates
[19:56:40] well, in that case, I'll be happy if someone packages the list of modules in requirements.txt :)
[19:57:02] the idea is to use what the distro gives you, or target the software you're building against
[19:57:13] in this case you either use the debian defaults, or something in the cloud archive
[19:57:33] Ryan_Lane: well, all the major wsgi deployment methods support virtualenv / pip
[19:57:48] and yes, this is a pain in the ass as a dev, but it's necessary for properly maintaining a service
[19:58:08] no sane person deploys using pip/virtualenv
[19:58:32] indeed, which is why this is the first time I Mention it :)
[19:58:50] and I'm not suggesting it, mostly because I don't know if pip even does signature validation
[19:58:55] the alternative is to use supported versions of libraries ;)
[19:59:30] Ryan_Lane: I am! Supported version of flask is 1.0+
[19:59:40] I mean
[19:59:40] I mean distro supported versions :)
[19:59:41] 0.10+
[19:59:53] YuviPanda: we package them in a repository and push them out with git-deploy
[20:00:09] we'd like to provide a deb though
[20:00:17] and later debs for each of the dependencies
[20:00:36] YuviPanda: don't use npm/node as a model for how things should work :)
[20:00:39] heh
[20:00:46] it's nearly as bad as ruby
[20:00:51] gwicke: deb repository?
[20:01:14] debian packages, yes
[20:01:17] heh, yeah, 'oh, just compile your version of ruby!' is... not an answer, yes
[20:01:48] YuviPanda: welcome to the crappy part of devops ;)
[20:02:25] Ryan_Lane: you can check me into a mental asylum if i ever say 'what is wrong with using RVM? it is flexible!'
[20:02:31] hahaha
[20:03:09] this entire discussion, however, is the crappy part.
not being able to use bleeding edge because it puts too heavy a burden is part of the process :)
[20:03:11] * valhallasw really wants pip2deb to exist
[20:03:17] valhallasw: it doea
[20:03:19] *does
[20:03:34] Ryan_Lane: I could only find a tool that lets you convert seperate packages
[20:03:46] not one that actually uses dependency information to package multiple packages
[20:03:50] ah
[20:03:51] right
[20:03:52] yeah
[20:04:10] Well, seems like once it's in a .deb it should be apt's job to manage dependencies
[20:04:13] that doesn't really solve the issue, though. because then you still need to maintain all the deps yourself
[20:04:28] the point of using distro versions is that someone else handles that for you
[20:04:40] and does it in a stable way
[20:05:22] OK, let me rephrase what I would like. A tool that takes a pypi package name, checks dependencies, apt-get installs them if possible, and if not, generates deb packages locally and installs those
[20:06:24] which means mixing and matching several existing tools, I guess.
[20:07:12] valhallasw: just have it automatically convert all of the things in PyPI to deb
[20:07:59] andrewbogott: i just tested it. python-redis is too old, and Flask-SQLAlchemy doesn't exist
[20:08:30] YuviPanda: Any idea what's needed to make it work with the older python-redis?
[20:08:49] andrewbogott: i can do that trivially, although with a sad face :P
[20:08:54] heh
[20:08:58] make it so! :D
[20:09:05] okay?
[20:09:06] :P
[20:09:07] hahaha
[20:09:09] If it's trivial to require fewer packages… for me that'd be a happy face
[20:09:19] But I spent 10 years in the backwards-compatibility business :)
[20:09:19] jump-through-my-hoops-face!
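
The tool valhallasw describes at 20:05:22 (take a PyPI name, use apt where a distro package exists, build a deb locally otherwise) comes down to a walk over the dependency graph. The following is only a rough sketch of that planning step, not an existing tool: `debian_name`, `plan_install`, the dependency map and the set of apt-available packages are all hypothetical inputs made up for illustration, and the `python-<name>` mapping is an assumed naming convention. Nothing here talks to PyPI or apt.

```python
from typing import Dict, List, Set, Tuple


def debian_name(pypi_name: str) -> str:
    """Assumed mapping from a PyPI name to a Debian package name."""
    return "python-" + pypi_name.lower().replace("_", "-")


def plan_install(
    root: str,
    requires: Dict[str, List[str]],
    apt_available: Set[str],
) -> Tuple[List[str], List[str]]:
    """Split a dependency tree into (apt packages, PyPI names to build locally).

    A package that apt can provide is not explored further, because apt
    resolves its dependencies itself. Anything else is queued for a local
    build after its own dependencies, so the build list is dependency-first.
    """
    apt_packages: List[str] = []
    build_locally: List[str] = []
    seen: Set[str] = set()

    def visit(name: str) -> None:
        if name in seen:
            return
        seen.add(name)
        deb = debian_name(name)
        if deb in apt_available:
            apt_packages.append(deb)
            return
        for dep in requires.get(name, []):
            visit(dep)
        build_locally.append(name)

    visit(root)
    return apt_packages, build_locally


# Hypothetical example data, loosely modelled on the situation in the log.
requires = {
    "Flask-SQLAlchemy": ["Flask", "SQLAlchemy"],
    "Flask": ["Werkzeug", "Jinja2"],
}
apt_available = {"python-flask", "python-sqlalchemy"}

apt_packages, build_locally = plan_install("Flask-SQLAlchemy", requires, apt_available)
print("apt-get install:", apt_packages)
print("build a deb for:", build_locally)
```

As Ryan_Lane points out in the log, a plan like this does not remove the maintenance burden: everything that lands in the local-build list still has to be kept up to date by hand.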
[20:09:46] andrewbogott: of course, everything *looks* okay :D
[20:09:49] well, we're going to target oslo in the future, which should make all of this a lot easier
[20:10:09] and maybe use one of the existing APIs and write a driver
[20:10:12] SadFacePanda: When you develop PC software you have to target the machines and OS your customers actually own, it doesn't really work to write software for the cutting-edge releases and just expect your customers to upgrade everything...
[20:10:21] What I'm saying is: you kids don't know how good you have it! *grumble*
[20:10:25] :D
[20:10:34] andrewbogott: or you statically link everything and send them a 13MB executable...
[20:10:45] no! bad panda!
[20:10:54] what, not enough diskspace?
[20:11:04] "well, that's more floppy disks, we don't want to spend that much..."
[20:11:06] That only works if you aren't, like, interacting with the OS in any interesting ways
[20:11:08] bundling things leads to a system of security fails
[20:11:28] oh. these 20 applications have a vulnerable version of library x
[20:11:53] solution: write everything in C. ALl the dependencies are so old anyway
[20:11:59] :)
[20:12:11] bonus: No security vulnerabilities, since, C, duh! :P
[20:12:27] * YuviPanda stcpys things around
[20:12:35] it's not a security vulnerability if you're working in ring 0!
[20:14:14] andrewbogott: can you test it out with the wikitech test instance, to see if it errors/
[20:14:15] ?
[20:14:24] andrewbogott: i seem to have lost all my bashhistory, so my curls are missing
[20:14:31] yep
[20:14:34] and will need to go back and re-read docs to type them all again
[20:14:35] andrewbogott: thanks
[20:14:42] andrewbogott: running on same host/port, but this time without pip
[20:18:56] YuviPanda: Looks ok to me
[20:19:10] andrewbogott: I'll submit a patchset in a while, but consider python-redis to be a nonissue
[20:19:16] just flask-sqlalchemy
[20:20:37] Looks like there's no flask-sqlalchemy in the havana repo
[20:20:46] So… Ryan_Lane, you think just pip2deb for the time being?
[20:21:08] Or should I create a gerrit repo and build a deb from there?
[20:23:43] andrewbogott: btw, the nginx package that is running - there's no gerrit repo for it
[20:23:48] the package, that is
[20:24:02] Yeah, that was going to be my next question
[20:24:28] andrewbogott: i'm not sure where you built it, but we put it on labsdebrepo and installed it via that
[20:24:36] * andrewbogott nods
[20:28:01] well, we could just bundle it all in the same package for now if we're going to fix this
[20:28:08] oh
[20:28:16] if it's just flask, let's pip2deb, then
[20:29:03] Ryan_Lane, and (same question as for nginx) how do you feel about it living only in the project storage and getting installed via labsdebrepo?
[20:29:07] Vs. living on brewster?
[20:32:24] if im using mysql workbench and i want to read replicas, the mysql hostname should be labsdb1001?
[20:34:42] andrewbogott: I'm fine with it for now
[20:34:46] nvm, enwiki.labsdb, got it thanks
[20:34:50] eventually we may want to use the same version in production, too, though
[20:34:56] depending on the feature set
[20:35:15] the only reason I'd say let's not use it right now is that we have a custom logging module for nginx
[20:37:43] Well, also I buit it myself which is not ideal
[20:37:47] *built
[20:38:34] I need to go to the apple store before my power cord starts shooting sparks. Back in a bit...
[20:40:46] Ryan_Lane: we can't use the version in production because we need lua features that aren't in the version in prod :D
[20:41:08] I'm saying we'd use the labs version in prod ;)
[20:41:57] Ryan_Lane: that'll be great, because SPDY! :P
[20:42:18] we wouldn't enable SPDY just yet
[20:43:28] sure, but it makes it easier
[20:43:34] this already has SPDY enabled
[20:45:39] I just created a new tool but I can't write to /data/project/toolname as me (i.e. as jarry1250) despite the fact that I'm in the relevant service group. Is there usually a delay?
[20:46:06] */data/project/toolname/public_html
[20:47:23] Coren: ^
[20:47:39] jarry1250: what's the toolname?
[20:47:45] wmukevents
[20:47:49] YuviPanda:
[20:58:12] Change on 12mediawiki a page OAuth was modified, changed by Jaredzimmerman (WMF) link https://www.mediawiki.org/w/index.php?diff=806416 edit summary: /* Use cases */
[21:01:56] YuviPanda: yeah, for when we're ready to test SPDY
[21:02:16] to really find SPDY useful we'd need to combine domains and such
[21:02:40] jarry1250: hmm, ownership seems fine, not sure what is the problem. will have to wait for Coren. sorry!
[21:02:42] Ryan_Lane: true.
[21:02:58] * Coren will be back with full attention in ~10m
[21:03:18] Ryan_Lane: still, can't hurt :D
[21:04:00] yeah
[21:04:08] there's features in the newer versions we'd want too
[21:04:12] yeah, it's SPDY/2
[21:04:16] rather than 3
[21:11:52] What's the best way to determine the revision delta, and if it is a revert, from the repdb?
[21:21:22] a930913: the revision checksum will enable you to find identify reverts, but that doesn't catch partial reverts, which I suspect you can perhaps identify through the edit comment
[21:21:36] I mean "identity reverts"
[21:23:04] I suspect you can calculate revision delta by comparing to the length of the parent revision, but note that sometimes the parent is deleted
[21:27:49] * Coren is back.
[21:28:08] jarry1250: Can you give me the path you have difficulty with?
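
The checksum and parent-length approach suggested above (21:21:22 to 21:23:04) can be done in a single pass over one page's history instead of one query per checksum. This is only a minimal sketch under stated assumptions: the rows are taken to be already fetched for a single page in chronological order, the `Revision` fields mirror the `rev_id`, `rev_parent_id`, `rev_len` and `rev_sha1` columns of the MediaWiki `revision` table, and `annotate` is a hypothetical helper name. As noted in the log, it only finds identity reverts, not partial ones.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple


class Revision(NamedTuple):
    rev_id: int
    parent_id: int  # 0 for the first revision of a page
    length: int     # content length in bytes
    sha1: str       # content checksum


def annotate(
    revisions: Iterable[Revision],
) -> List[Tuple[int, Optional[int], Optional[int]]]:
    """Return (rev_id, size delta, rev_id reverted to) for each revision.

    The delta is None when the parent row is missing (for example because
    the parent was deleted). The third element is set when the content
    matches an earlier revision and differs from the parent.
    """
    by_id: Dict[int, Revision] = {}
    first_seen: Dict[str, int] = {}  # checksum -> earliest rev_id with that content
    out: List[Tuple[int, Optional[int], Optional[int]]] = []

    for rev in revisions:
        parent = by_id.get(rev.parent_id)
        if rev.parent_id == 0:
            delta: Optional[int] = rev.length  # page creation
        elif parent is None:
            delta = None                       # parent not available
        else:
            delta = rev.length - parent.length

        reverted_to = first_seen.get(rev.sha1)
        if parent is not None and parent.sha1 == rev.sha1:
            reverted_to = None                 # null edit, not a revert

        out.append((rev.rev_id, delta, reverted_to))
        by_id[rev.rev_id] = rev
        first_seen.setdefault(rev.sha1, rev.rev_id)

    return out


# Hypothetical history: revision 3 restores the content of revision 1,
# and revision 4 points at a parent that is not in the result set.
history = [
    Revision(1, 0, 100, "a"),
    Revision(2, 1, 150, "b"),
    Revision(3, 2, 100, "a"),
    Revision(4, 99, 120, "c"),
]
annotated = annotate(history)
for row in annotated:
    print(row)
```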
[21:30:54] Nettrom: I know how I /can/ do it, I was asking if there was a "best" way to do it. I mean I could run a query for every checksum, but that would be silly.
[21:55:49] why is it that if i connect to the enwiki replicas and run something like
[21:55:56] select * from templatelinks where tl_title like 'Cite_doi';
[21:56:18] it takes 2 minutes 3 seconds, but if i were to call the api, it would happen much faster?
[21:58:31] coren: It works now, thanks :)
[22:55:13] Hi, I want to use Labs to host an .ics calendar file of WMUK events, but we block the Google spider, right?
[22:55:46] jarry1250: http://tools.wmflabs.org/robots.txt 404s
[22:55:46] so...
[22:55:46] I guess not?
[22:56:01] I thought I reaad this on labs-l
[22:56:22] Otherwise, the fact that merely moving a file onto Labs prompts an error from Google Calendar is intriguing
[22:56:24] Ryan_Lane, is there a tool literally called 'pip2deb' or is it something else?
[22:56:51] Or just pip followed by py2deb?
[22:57:35] I haven't used the pip debian stuff in a while
[22:57:38] I'm not really sure
[22:58:51] ok
[22:58:57] I can certainly pip & py2deb
[22:59:39] python-stdeb - Python to Debian source package conversion utility
[23:00:30] * py2dsc will convert a distutils-built source tarball into a Debian
[23:00:30] source package.
[23:01:59] YuviPanda: See Tim L's email of Aug 15
[23:02:13] jarry1250: can you give me a title?
[23:02:14] "for the moment, [we] disallow *all* crawler accesses *any-
[23:02:14] where*.... So: ... If your tool *needs* to be crawled by search engine bots or
[23:02:14] this causes other problems for you, please speak up.
[23:02:15] to search for
[23:02:27] jarry1250: hmm, is certainly not the case right now
[23:02:53] because of robots.txt not existing
[23:03:00] unsure if that is intentional or not
[23:07:48] Coren: Ring any bells with you? Google just thinks my tool is unreachable.
[23:09:34] Yeah, there is a block on crawlers at the proxy level
[23:10:08] They were destroying many bots that had expensive dynamically generated content that did DB queries and whose pages had links to further queries.
[23:10:15] And taking down the webservers with them.
[23:10:24] Coren: Yes, I do vaguely recall the labs-l threads
[23:10:37] Is it customisable atm though? Can one opt-in? I assume not.
[23:11:04] Not per tool, although I'll probably turn it off for those that use the new scheme.
[23:11:17] (Or at least, make it turn-it-off-able)
[23:18:40] Coren: Alas Google uses the same UA for Calendar requests, which is annoying because Google crawling the calendar once per day is not going to bring a webserver down (the calendar composition requiring a single API call).
[23:19:02] That is... insanely dumb of them.
[23:19:36] That said, it's probably okay-ish to unblock Googlebot -- it's probably the most well behaved of the lot.
[23:20:23] * Coren does so.
[23:21:11] "16.3% of sites suffer from Googlebot Impersonation attacks of some kind." Among those targeted sites, 21% of those claiming to be Googlebot, were impersonators." .. heh
[23:28:00] If we get hit by a faux Googlebot, it's a simple matter to just block its IP
[23:39:36] renoirb: when naming instances, try to be more descriptive with the names :)
[23:39:51] unfortunately the way we're doing things right now, the instance names need to be globally unique
[23:39:56] yes, proxy-project-proxy on project-proxy
[23:40:00] we'll be changing that at some point, but it's an issue right now
[23:40:58] * ^d names his instances instance[0-9] to bug Ryan
[23:41:10] it doesn't bother me :)
[23:41:20] using good names only makes things better for you
[23:41:49] <^d> I wonder if anything in the [a-z]{1} range is taken :p
[23:43:06] one way to find out...
[23:45:25] try some unicode smileys
[23:45:34] mutante: doesn't work
[23:45:38] ok:)
[23:45:42] we limit instance names to ascii
[23:50:31] * anomie|away names instances "inst7", "inst8", "inst9", and, umm, "destroyed"
[23:51:30] <^d> "test"
[23:51:37] <^d> "test-instance-1"
[23:51:39] <^d> "junk"
[23:51:44] <^d> "my-instance"
[23:51:46] <^d> :)