[01:49:33] <wikibugs_>	 Analytics-Kanban, User-Elukey: Calculate how much Popups events EL databases can host - https://phabricator.wikimedia.org/T172322#3551634 (Tbayer) PS (after discussing with @JKatzWMF ): That means that it is now fine from everyone's perspective to drop `log.MobileWebUIClickTracking_10742159_15423246`, as...
[05:49:46] <wikibugs_>	 Analytics-Kanban, Patch-For-Review: Troubleshoot Wikimetrics "magic button" - https://phabricator.wikimedia.org/T173585#3551782 (Marostegui) @Ottomata I have updated the connections limit: ``` mysql:root@localhost [mysql]> show grants for 's52262'@'%'\G *************************** 1. row ****************...
[07:38:09] <gehel>	 bearloga, ottomata: from the reading of the backlog, I understand that you have a temporary solution and that ottomata is looking into cleaning up the user / groups for a longer term solution. If that's not the case and you need my help, ping me!
[09:07:01] <wikibugs_>	 Analytics, Operations, Ops-Access-Requests, Research, and 2 others: NDA, MOU and LDAP (analytics cluster) for Shilad Sen - https://phabricator.wikimedia.org/T171988#3552027 (Shilad) One follow-up: The Navigation Vectors project [[ https://github.com/ewulczyn/wiki-vectors/blob/master/src/get_reque...
[11:12:58] <wikibugs_>	 Analytics-Tech-community-metrics, Developer-Relations (Jul-Sep 2017): Have "Last Attracted Developers" information for Gerrit (already exists for Git) - https://phabricator.wikimedia.org/T151161#3552414 (Aklapper) >>! In T151161#3379400, @Aklapper wrote: > Yay, thank you a lot! happy to see that deployed...
[11:36:24] <wikibugs_>	 Analytics-Tech-community-metrics, Developer-Relations (Jul-Sep 2017): Automatically sync mediawiki-identities/wikimedia-affiliations.json DB dump file with the data available on wikimedia.biterg.io - https://phabricator.wikimedia.org/T157898#3020071 (Aklapper)
[11:36:28] <wikibugs_>	 Analytics-Tech-community-metrics, Developer-Relations (Jul-Sep 2017): Have "Last Attracted Developers" information for Gerrit (already exists for Git) - https://phabricator.wikimedia.org/T151161#2809332 (Aklapper)
[12:03:02] <wikibugs_>	 Analytics-Kanban, Analytics-Wikistats: Handle long project names in Wikiselector - https://phabricator.wikimedia.org/T173373#3552591 (fdans)
[13:35:53] <ottomata>	 gehel:  we don't have a temp solution yet
[13:35:56] <ottomata>	 i thought i could do a quick fix
[13:36:02] <ottomata>	 buuuut, the issue was larger than just that
[13:36:05] <ottomata>	 so it's not working for bearloga yet
[13:36:12] <ottomata>	 see: https://phabricator.wikimedia.org/T174110
[13:36:28] <gehel>	 damn... seemingly simple problems never are so simple ...
[13:36:51] <gehel>	 we should probably go back to having this script run under his own user for the moment... 
[13:40:08] <ottomata>	 gehel:  ya for the ones that access hive, yeah
[13:40:18] <ottomata>	 the problem is the way we manage groups and users in puppet
[13:40:26] <ottomata>	 we can't easily add system users to real user groups
[13:40:34] <gehel>	 yeah, that's a pain!
[13:40:35] <ottomata>	 because puppet manages them differently than the admin module does
[13:40:43] <ottomata>	 and the admin module will remove the system users from the groups
[13:40:54] <ottomata>	 and for hadoop, it's more complicated, because it's a distributed system
[13:40:58] <gehel>	 couldn't we manage those system users entirely with the admin module?
[13:41:00] <ottomata>	 so the users have to be realized on certain nodes (the namenodes)
[13:41:16] <ottomata>	 i suppose we could, buuuuut we don't?
[13:41:40] <ottomata>	 that would basically mean that we couldn't use the user {} and group {} resources in puppet
[13:41:45] <ottomata>	 or
[13:41:46] <ottomata>	 we could
[13:41:55] <ottomata>	 but only for ones that are not also in admin module groups
[13:42:11] <ottomata>	 HMm, i wonder if that is possible or allowed
[13:42:21] <ottomata>	 could we just put discovery-stats user in admin data.yaml????///?/??
[13:42:39] <gehel>	 it makes the link between resources needing that user / group and its declaration less obvious
[13:44:59] <gehel>	 I think that technically it would work (not tested)
[13:45:08] <gehel>	 but we seem to have only real users in data.yaml
[13:45:25] <ottomata>	 yeah, this is not the first time this has come up, but i can't really remember why we didn't do this earlier
[13:45:30] <ottomata>	 we should ask chasemp
[13:45:53] <gehel>	 I seem to remember some rule about no technical user in data.yaml, but really not sure...
[13:57:19] <chasemp>	 if you guys can summarize on a task I'll try to see if I can make sense of it
[14:04:00] <ottomata>	 chasemp: 
[14:04:06] <wikibugs_>	 Analytics, Discovery, Discovery-Analysis (Current work), Patch-For-Review: Private data access for non-person user that calculates metrics - https://phabricator.wikimedia.org/T174110#3553045 (Ottomata) Summary for @chasemp:  Analytics needs a way to:  - create system users - have real users in ce...
[14:04:08] <ottomata>	 i added a little more info for you at bottom
[14:04:11] <ottomata>	 https://phabricator.wikimedia.org/T174110
[14:05:04] <wikibugs_>	 Analytics, Discovery, Discovery-Analysis (Current work), Patch-For-Review: Private data access for non-person user that calculates metrics - https://phabricator.wikimedia.org/T174110#3553049 (Ottomata) That last bullet there is not as important as the first 3 :)
[14:05:16] <chasemp>	 ah, afaik this is the sticking point right? have system users be placed in real user groups, so that posix group permissions can be used to restrict access to data for both real users and system users
[14:05:33] <chasemp>	 that isn't possible yeah
[14:05:41] <chasemp>	 question tho: what are you restricting?
[14:05:45] <chasemp>	 is it files in a certain directory?
[14:06:39] <ottomata>	 yeah chasemp for this case
[14:06:50] <ottomata>	 discovery is trying to automate reports
[14:07:07] <ottomata>	 scripts are launched from stat1005 that query several sources, one of which is hadoop/hive
[14:07:13] <ottomata>	 in order to access webrequest data in hive
[14:07:16] <ottomata>	 a user needs:
[14:07:22] <ottomata>	 - to be in the analytics-privatedata-users group
[14:07:31] <ottomata>	 - have an account on the hadoop namenodes (analytics1001,analytics1002)
[14:07:58] <ottomata>	 chasemp:  would it be possible to make a system user in data.yaml without any ssh keys?
[14:08:20] <ottomata>	 i know we've talked about this before, apologies if you've already said no and i've forgotten
[14:09:54] <chasemp>	 two levels to that are: by policy it was mandated no system users polluted admin.yaml, and the mechanism has to manage the group members as a whole or it's not worth much since selectively ignoring unmanaged users is an intractable security issue
[14:09:58] <chasemp>	 I guess what I'm wondering is
[14:10:15] <chasemp>	 you have some system users that operations run under the guise of
[14:10:27] <chasemp>	 and some users who are in particular groups to access that output 
[14:10:31] <chasemp>	 or something like that
[14:10:52] <chasemp>	 and there is now a new process where discovery wants to do some more automated things
[14:11:01] <chasemp>	 but someone cannot access the end state data
[14:11:13] <chasemp>	 and we want to put a system user in a human group to overcome
[14:11:19] <chasemp>	 but if it's the end state data that cannot be accessed
[14:11:20] <gehel>	 chasemp: until now, those scripts were running as a human user
[14:11:35] <chasemp>	 can we do some setfacl things?
[14:11:42] <chasemp>	 sudo setfacl -Rdm g:groupnamehere:rwx /base/path/members/
[14:11:44] <chasemp>	 etc
[14:11:45] <ottomata>	 chasemp:  the end state data is in hdfs
[14:11:48] <gehel>	 so yes, what you describe is the first part of the issue
[14:12:11] <ottomata>	 but hdfs does have acls...
[14:12:12] <ottomata>	 so
[14:12:13] <ottomata>	 maybe
[14:12:13] <ottomata>	 but we don't
[14:12:17] <ottomata>	  do that anywhere else
[14:12:26] <ottomata>	 not sure if we should or not, but it's worth investigating
[14:12:52] <gehel>	 ottomata: the end, end state is on stat1006:/srv/published-datasets/discovery/ (I might be wrong, I started to look into this only 2 days ago, I'm probably missing some context)
[14:12:54] <ottomata>	 i wouldn't be excited about it unless we could puppetize the acls in some way, at least for users
[14:13:08] <ottomata>	 gehel:  yeah, but this isn't the first time this has come up gehel
[14:13:14] <ottomata>	 others have had this problem too
[14:14:19] <gehel>	 and to go back to chasemp's description of the issue, there is also access to hive that is controlled by group membership, and the technical user needs to have this access (correct me if I'm wrong)
[14:14:36] <ottomata>	 yeah
[14:14:47] <ottomata>	 acls actually might be the answer here, this is also a limitation of posix permissions
[14:15:02] <ottomata>	 in that you might need multiple groups to be able to read something, but want to restrict others
[14:15:14] <ottomata>	 am reading a little about hdfs acls, i think this might work....
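The HDFS ACL idea floated above could look roughly like the following. This is only a sketch, not something the participants wrote: it builds (without running) an `hdfs dfs -setfacl` command line that adds a named ACL entry for the system user. The helper name `hdfs_setfacl_args` and the path are hypothetical, and it assumes ACLs are enabled on the namenode (`dfs.namenode.acls.enabled`).

```python
from typing import List


def hdfs_setfacl_args(path: str, kind: str, name: str,
                      perms: str = "r-x", recursive: bool = True) -> List[str]:
    """Build (but do not run) an `hdfs dfs -setfacl -m` command line.

    `kind` is "user" or "group"; `name` is the account or group that
    should get `perms` on `path` in addition to the owner/group/other bits.
    """
    if kind not in ("user", "group"):
        raise ValueError(f"kind must be 'user' or 'group', got {kind!r}")
    if len(perms) != 3 or any(c not in "rwx-" for c in perms):
        raise ValueError(f"bad permission string: {perms!r}")
    args = ["hdfs", "dfs", "-setfacl"]
    if recursive:
        args.append("-R")  # apply to everything below `path` as well
    # -m modifies the ACL: it adds or replaces this one entry and
    # leaves the existing entries in place.
    args += ["-m", f"{kind}:{name}:{perms}", path]
    return args


# Hypothetical path; discovery-stats is the system user from the discussion.
print(" ".join(hdfs_setfacl_args("/path/to/private/data", "user", "discovery-stats")))
```

An entry like this would let the system user read the data without being a member of analytics-privatedata-users, which is the multiple-groups limitation of plain posix permissions mentioned above.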
[14:15:25] <chasemp>	 so there is a long standing ban on managing system users via admin.yaml
[14:15:34] <chasemp>	 but there is no such ban on managing humans in a group within puppet?
[14:15:35] <ottomata>	 chasemp:  yeah, just curious, why?
[14:15:52] <gehel>	 we could also probably make most of the files created by the technical user to be world readable. As far as I understand, this is all data that is published on the internet anyway
[14:16:08] <ottomata>	 for system users that come from things like deb packages
[14:16:10] <ottomata>	 i understand
[14:16:15] <ottomata>	 but for system users that we create
[14:16:21] <ottomata>	 it seems like it might be ok to put in data.yaml?
[14:16:40] <ottomata>	 gehel:  yeah, but the problem is accessing the privatedata in hdfs
[14:16:42] <chasemp>	 admin.yaml creates users in the human UID range and not the system range etc
[14:16:58] <chasemp>	 and it gets really ugly managing users there that are shadowed manually from a puppet user setup
[14:17:10] <chasemp>	 and maybe because this use case didn't exist then
[14:17:15] <chasemp>	 I haven't thought about it forever
[14:17:19] <ottomata>	 for these types of system users though
[14:17:25] <ottomata>	 it doesn't really matter if they are 'system' users
[14:17:31] <ottomata>	 in the system uid range
[14:17:32] <chasemp>	 I disagree there
[14:17:34] <ottomata>	 ya?
[14:17:51] <chasemp>	 you are talking to the person who spent like 5 months running down all the places that was abused :)
[14:17:55] <ottomata>	 all that they  need is an account that is not tied to a real person that can run scripts
[14:18:09] <chasemp>	 sure, but that is the definition of a system user
[14:18:27] <ottomata>	 ya, but what's the point of the distinction?
[14:18:34] <ottomata>	 if the accounts function in the same way?
[14:18:52] <chasemp>	 I guess, we would need to find out why unix makes a distinction 
[14:18:58] <chasemp>	 but overall I think security and sanity
[14:18:59] <ottomata>	 (btw, i think acls may be the correct solution here, now i'm just asking to ask :) )
[14:19:19] * gehel knows next to nothing about ACL except that they look "complicated"
[14:19:23] <chasemp>	 mixing management of system users and human users never ends well
[14:19:30] <chasemp>	 that's not a very satisfying answer
[14:20:24] <chasemp>	 so two angles on your problem seem to be accessing the end state data and access to actually run things is also controlled via posix group right?
[14:20:27] <ottomata>	 yeah, in cases like this i can't see how it would be so bad, buuuut?
[14:20:30] <chasemp>	 if only you could put one group in another
[14:20:47] <gehel>	 it looks to me that unix permissions in themselves are sufficient to solve our issue, but the way we manage those permissions is not. Adding ACL in the mix to solve the fact that our permission management isn't flexible enough does not look like a great solution
[14:20:51] <ottomata>	 hmm, access to actually run things?  oh like to sudo to that user?
[14:20:51] <ottomata>	 yeah
[14:21:39] <chasemp>	 can not hadoop allow users in multiple groups?
[14:21:46] <chasemp>	 that seems really hardcore
[14:23:11] <chasemp>	 I'm only understanding a portion of this sorry, it's complicated.  sorry you guys are hung up
[14:24:21] <wikibugs_>	 Analytics-Tech-community-metrics, Developer-Relations: Adjust to Grimoirelab / Bitergia moving to GitLab - https://phabricator.wikimedia.org/T171290#3553080 (Aklapper) From last email by Luis: > The workflow will be the following: > - our process exports as .tgz the SH dump every day (this is the output...
[14:24:24] <chasemp>	 ottomata: what I was going to suggest is you could move the management of analytics-privatedata-users out of admin.yaml if you can't find a way to not mix humans/system users, but it's an exception possibly no better than putting the system user in admin.yaml 
[14:24:44] <ottomata>	 > can not hadoop allow users in multiple groups?
[14:24:48] <ottomata>	 chasemp:  not sure what that means?
[14:24:57] <ottomata>	 hadoop can mostly do whatever  posix can do
[14:25:01] <chasemp>	 This was confirmed by @Ottomata. Specifically, "discovery-stats" is not in the "analytics-privatedata-users" group that would give it Hadoop & webrequest access. 
[14:25:05] <ottomata>	 as it uses the posix perms on the namenodes to confirm
[14:25:56] <chasemp>	 is hadoop access there access to run jobs against Hadoop or access to view results from jobs run or both?
[14:26:19] <chasemp>	 my expectation is hadoop has a configuration that says "users in these groups can run jobs"
[14:26:28] <chasemp>	 but maybe it's simpler than that
[14:26:32] <ottomata>	 naw, it doesn't
[14:27:03] <ottomata>	 pretty much: if you have a hadoop client that can access the hadoop master node(s), you can launch a job
[14:27:09] <ottomata>	 but, hdfs is a filesystem
[14:27:25] <ottomata>	 so data on it is restricted much like it is on a regular filesystem
[14:27:41] <ottomata>	 there are some techs that allow for finer grained control of cluster access
[14:27:50] <ottomata>	 hive table level access, probably yarn job launching
[14:27:57] <ottomata>	 but they mostly require kerberos stuff
[14:29:19] <ottomata>	 e.g. https://cwiki.apache.org/confluence/display/SENTRY/Sentry+Tutorial
[14:31:51] <chasemp>	 I think I know what I would do but let's do a call and talk it over next week (if you can wait?)?
[14:31:53] <chasemp>	 I'll try to help
[14:32:04] <chasemp>	 but I'm distracted right now and it's Not Simple
[14:32:22] <chasemp>	 but my thinking is essentially you need to separate the admin.yaml group from the group Hive uses
[14:32:37] <chasemp>	 but there is no such thing as subgroups in posix
[14:32:52] <ottomata>	 right, that's what the ACLs are for
[14:32:53] <chasemp>	 so what if there was a simple function that merged two arrays of users (human and system)
[14:33:23] <ottomata>	 chasemp:  like just a separate specification of system users that should be in analytics-privatedata-users
[14:33:32] <ottomata>	 and then admin module wouldn't remove the user from that group?
[14:33:45] <chasemp>	 what I mean is more like
[14:34:00] <chasemp>	 make analytics-privatedata-users an autogenerated group w/ multiple sources
[14:34:10] <chasemp>	 one of which is mirrored users from a human group in admin.yaml
[14:34:19] <chasemp>	 and another source is an array defined in hiera of system users
[14:34:20] <chasemp>	 so
[14:34:48] <chasemp>	 analytics-privatedata-users would not directly be in admin.yaml and the entirety of the group on system could still be managed
[14:34:54] <chasemp>	 but it would be the result of multiple sources
[14:35:00] <chasemp>	 and it would still be dynamic
[14:35:03] <chasemp>	 idk, a thought
[14:35:13] <ottomata>	 > analytics-privatedata-users would not directly be in admin.yaml
[14:35:13] <ottomata>	 ?
[14:35:16] <ottomata>	 NOT?
[14:35:23] <chasemp>	 :) not
[14:35:31] <ottomata>	 but, multiple sources?
[14:35:36] <ottomata>	 what is the source for the real users?
[14:36:02] <chasemp>	 Ok two problems afaict
[14:36:14] <chasemp>	 access to the underlying systems are controlled via one posix group 
[14:36:26] <chasemp>	 that posix group is atm all humans and admin.yaml isn't prepared for more
[14:36:41] <chasemp>	 and results are controlled the same way from the same ecosystem
[14:36:49] <chasemp>	 we need an extra layer that can take a group from admin.yaml
[14:36:54] <chasemp>	 and a group of system accounts
[14:37:06] <chasemp>	 and manage a merged group for access to shared resources by human and system users
[14:37:13] <ottomata>	 oh
[14:37:15] <ottomata>	 a new group?
[14:37:38] <chasemp>	 sure you would need a new group in admin.yaml that you wanted to be used to derive the end state analytics-privatedata-users on host
[14:37:40] <ottomata>	 that includes everybody in apu, + system users in that group
[14:37:50] <chasemp>	 it's a thought
[14:37:55] <ottomata>	 i see
[14:38:10] <ottomata>	 oh, but the system users in that new group would be declared in hiera
[14:38:18] <chasemp>	 right
[14:38:18] <ottomata>	 not via user {} puppet resource
[14:38:31] <chasemp>	 they would be declared via user{} somewhere w/ system set to true
[14:38:43] <chasemp>	 and you could piggyback on the existing admin module functions
[14:38:55] <chasemp>	 and just verify that all users from the non-admin.yaml source are indeed in the system user UID range
[14:38:58] <chasemp>	 and it would be fairly safe I think
[14:39:15] <ottomata>	 hm interesting
[14:39:30] <chasemp>	 the mechanism that manages users in a group just takes an array iirc
[14:39:49] <chasemp>	 so how you compile that array is negotiable if called from outside of the initial admin.yaml run
[14:39:50] <ottomata>	 so something in hiera like
[14:40:26] <ottomata>	 systemuser_group_merges:
[14:40:26] <ottomata>	   analytics-privatedata:
[14:40:26] <ottomata>	     systemusers: [sys_userA, sys_userB]
[14:40:26] <ottomata>	     groups: [ analytics-privatedata-users ]
[14:40:37] <ottomata>	 then we could chown the hdfs data to analytics-privatedata
[14:41:03] <ottomata>	 chgrp*
[14:41:07] <chasemp>	 yes
[14:41:28] <chasemp>	 and a simple merge of those two arrays with optional safety validation and a call to admin::groupmembers in some analytics module
[14:41:32] <chasemp>	 would I think get you all you want
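The merge-with-validation step described above can be sketched as follows. This is a Python stand-in for what would really be Puppet code, not the admin module's actual API: the function name `merged_group_members`, the `uid_of` lookup table, and the UID boundary are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List

# Assumption: system accounts have UIDs below 1000 (a common Debian
# default); the real boundary depends on how the hosts are configured.
SYSTEM_UID_MAX = 999


def merged_group_members(human_members: Iterable[str],
                         system_users: Iterable[str],
                         uid_of: Dict[str, int]) -> List[str]:
    """Return the sorted union of human and system members of a group.

    Raises ValueError if a name listed as a system user has no known
    UID or has a UID outside the system range. This is the "optional
    safety validation" from the discussion.
    """
    system_users = list(system_users)
    for name in system_users:
        uid = uid_of.get(name)
        if uid is None or uid > SYSTEM_UID_MAX:
            raise ValueError(f"{name!r} is not a known system user")
    # The union is what would be handed to whatever manages group
    # membership as a whole, so no member is left unmanaged.
    return sorted(set(human_members) | set(system_users))


# Hypothetical data: members of the human group plus the hiera array.
uids = {"sys_userA": 498, "sys_userB": 499, "alice": 2001}
print(merged_group_members(["alice", "bob"], ["sys_userA", "sys_userB"], uids))
```

Because the result is a complete member list, the group on each host can still be managed as a whole, which was the concern about selectively ignoring unmanaged users.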
[14:41:39] <ottomata>	 K, interesting.  I will note this and the ACL idea on the ticket
[14:42:11] <chasemp>	 for my money this would be simpler than setfacl things even though it's moving parts
[14:42:31] <chasemp>	 it's fairly easy to debug a hash merge and all other components are basically the same
[14:43:01] <chasemp>	 ok I gotta get back to what I was doing ottomata, drop me a meeting if you want to talk on hangout about it :)
[14:43:35] <wikibugs_>	 Analytics, Discovery, Discovery-Analysis (Current work), Patch-For-Review: Private data access for non-person user that calculates metrics - https://phabricator.wikimedia.org/T174110#3553136 (Ottomata) IRC convo with @chasemp, had 2 ideas:  1. use ACLs.  https://hortonworks.com/blog/hdfs-acls-fin...
[14:43:40] <ottomata>	 ok great, thanks chasemp
[14:43:42] <chasemp>	 (make an admin::mixed_groupmembers function that does this calling admin::groupmembers at the end w/ the validation in the middle)
[14:43:48] <ottomata>	 the trouble is, either of these solutions are going to take a bit to get working
[14:43:56] <ottomata>	 so, not sure what to tell gehel and bearloga right now
[14:44:06] <chasemp>	 it's a few days of work I think yeah
[14:44:17] <ottomata>	 bearloga:  could you disable the automation of the hive based jobs, and for now just run those as yourself?
[14:45:50] <ottomata>	 haha, we should just make every user group have a corresponding system user that is sudoable to by users in that group
[14:45:58] <ottomata>	 :p
[14:46:26] <ottomata>	 chasemp:  would that be so bad?  if it wasn't configurable, but just the default that always happened?
[14:47:03] <chasemp>	 it's maybe not a terrible idea but it's a much bigger change with broader implications I think
[14:47:23] <ottomata>	 or, maybe it could be enabled selectively for the group?  but you couldn't configure the name
[14:47:32] <chasemp>	 yeah that would be doable
[14:47:49] <chasemp>	 I'm mulling it over, not a terrible thought
[14:48:00] <chasemp>	 service_account => true,
[14:48:03] <chasemp>	 in admin.yaml
[14:48:13] <chasemp>	 etc
[14:48:20] <chasemp>	 ok I really got to get back :)
[14:49:07] <chasemp>	 (then you have to manage the sudo perms for it separately already tho so it's not too much less than the generated group idea I think -- later)
[14:49:29] <ottomata>	 aye
[14:49:31] <ottomata>	 oook, laters! 
[14:49:33] <ottomata>	 thanks chasemp
[14:49:51] <wikibugs_>	 Analytics, Discovery, Discovery-Analysis (Current work), Patch-For-Review: Private data access for non-person user that calculates metrics - https://phabricator.wikimedia.org/T174110#3553147 (Ottomata) Oo, one more idea:  3. Make every user group have a corresponding system user, that could be se...
[14:55:28] <mforns>	 hey ottomata want to cave 4 mins pre-standup?
[14:57:36] <ottomata>	 mforns:  sure i gotta run really super soon though
[14:57:40] <ottomata>	 wasn't going to make standup
[14:57:45] <ottomata>	 in cave
[15:03:15] <ottomata>	 mforns:  just in case, maybe you don't have branch creation perms?
[15:03:20] <ottomata>	 i'll create one for you real quick
[15:03:23] <ottomata>	 called mforns-dev
[15:03:24] <ottomata>	 ok?
[15:03:28] <mforns>	 ok ottomata thanks!!!
[15:03:46] <ottomata>	 done
[15:03:50] <ottomata>	 now you should be able to push to gerrit with
[15:03:56] <ottomata>	 git review mffons-dev
[15:04:02] <ottomata>	 git review mforns-dev
[15:04:06] <ottomata>	 ok byyye
[16:09:16] <wikibugs_>	 (PS1) Mforns: [WIP] Close database sessions to avoid max connection issue [analytics/wikimetrics] (mforns-dev) - https://gerrit.wikimedia.org/r/373921 (https://phabricator.wikimedia.org/T173585)
[16:10:05] <wikibugs_>	 (CR) jerkins-bot: [V: -1] [WIP] Close database sessions to avoid max connection issue [analytics/wikimetrics] (mforns-dev) - https://gerrit.wikimedia.org/r/373921 (https://phabricator.wikimedia.org/T173585) (owner: Mforns)
[16:10:59] <wikibugs_>	 (CR) Mforns: [V: 2 C: 2] "self merging to deploy the code to staging and be able to test" [analytics/wikimetrics] (mforns-dev) - https://gerrit.wikimedia.org/r/373921 (https://phabricator.wikimedia.org/T173585) (owner: Mforns)
[16:13:33] <wikibugs_>	 Analytics-Kanban, Analytics-Wikistats: Add edits endpoint to AQS using druid as a backend - https://phabricator.wikimedia.org/T174174#3553376 (JAllemandou)
[16:13:57] <wikibugs_>	 Analytics-Kanban, Analytics-Wikistats: Add edits endpoint to AQS using druid as a backend - https://phabricator.wikimedia.org/T174174#3553376 (JAllemandou) a:Milimetric>JAllemandou
[16:15:43] <wikibugs_>	 (PS1) Mforns: Revert "[WIP] Close database sessions to avoid max connection issue" [analytics/wikimetrics] (mforns-dev) - https://gerrit.wikimedia.org/r/373922
[16:16:37] <wikibugs_>	 (CR) jerkins-bot: [V: -1] Revert "[WIP] Close database sessions to avoid max connection issue" [analytics/wikimetrics] (mforns-dev) - https://gerrit.wikimedia.org/r/373922 (owner: Mforns)
[16:37:13] <wikibugs_>	 (CR) Mforns: [V: 2 C: 2] "Reverting changes to mforns-dev (staging failed)" [analytics/wikimetrics] (mforns-dev) - https://gerrit.wikimedia.org/r/373922 (owner: Mforns)
[16:40:00] <wikibugs_>	 Analytics, DBA, Data-Services, Research, cloud-services-team (Kanban): Implement technical details and process for "datasets_p" on wikireplica hosts - https://phabricator.wikimedia.org/T173511#3553473 (bd808)
[16:41:38] <wikibugs_>	 (PS1) Mforns: [WIP] Close connections between report executions [analytics/wikimetrics] (mforns-dev) - https://gerrit.wikimedia.org/r/373925
[16:42:28] <wikibugs_>	 (CR) Mforns: [V: 2 C: 2] "Self-merging for staging test (branch mforns-dev)" [analytics/wikimetrics] (mforns-dev) - https://gerrit.wikimedia.org/r/373925 (owner: Mforns)
[16:42:30] <wikibugs_>	 (CR) jerkins-bot: [V: -1] [WIP] Close connections between report executions [analytics/wikimetrics] (mforns-dev) - https://gerrit.wikimedia.org/r/373925 (owner: Mforns)
[17:42:07] <bearloga>	 gehel: are you still here or are you signing off for the weekend? if you have a minute, could you please comment out https://github.com/wikimedia/puppet/blob/production/modules/profile/manifests/statistics/private.pp#L63 for now? i will run the scripts under my account until the system user management stuff is agreed on
[17:43:11] <bearloga>	 gehel: also i think you'd probably need to kill the reportupdater process that started up last night from cron
[18:47:52] <wikibugs_>	 Analytics, Analytics-Wikistats, Operations, Wikidata, and 6 others: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3553954 (Marostegui)
[19:59:40] <wikibugs_>	 (PS1) Joal: [WIP] Add wikistats endpoint with edits metric [analytics/aqs] - https://gerrit.wikimedia.org/r/373961 (https://phabricator.wikimedia.org/T174174)
[20:00:27] * joal is happy ! AQS-to-Druid was not that hard to put together :)
[20:00:35] <joal>	 fdans, mforns --^
[20:00:51] <mforns>	 joal, oooooh!!!!!
[20:00:53] <joal>	 when you have a minute, can you please double check the approach is ok for you?
[20:00:58] <mforns>	 sure
[20:01:27] <joal>	 functionally it works, tested at home - BUT: Difficult to unit-test - I need to come up with something :)
[20:01:32] <Nettrom>	 are there any known issues with the mediawiki_history data in Hadoop? I’m getting a bunch of errors when running queries
[20:01:39] <mforns>	 ok
[20:01:49] <mforns>	 Nettrom, are they related to timestamps?
[20:01:56] <Nettrom>	 mforns: looks to be, yes
[20:02:13] <mforns>	 joal, I also get timestamp errors when querying mediawiki_history
[20:02:42] <joal>	 hm - Nettrom, mforns: what type of error?
[20:03:02] <Nettrom>	 should I just copy & paste the error?
[20:03:15] <mforns>	 joal: Hive Runtime Error while processing row [Error getting row data with exception java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.hive.serde2.io.TimestampWritable
[20:03:16] <joal>	 in a gist or something Nettrom :)
[20:03:33] <Nettrom>	 or maybe just +1 what mforns just posted
[20:03:42] <mforns>	 did I miss a serde import?
[20:04:16] <joal>	 Arf, this is really my bad guys, give me a minute please - those timestamps should be strings (even if in timestamp format)
[20:04:28] * joal goes fixing hive again
[20:04:42] <Nettrom>	 joal: no worries, great to hear it can be fixed :)
[20:04:56] <fdans>	 joal: you hero :)
[20:05:21] <joal>	 fdans: I feel no hero when I fix stuff I break ;)
[20:05:35] <joal>	 but thank you anyway fdans :)
[20:05:39] <joal>	 feels good ;)
[20:05:41] <mforns>	 joal, let me help you (if I can) it's too late for you
[20:05:50] <fdans>	 haha I meant re: aqs to druid
[20:05:57] <joal>	 Ah :)
[20:06:19] <mforns>	 joal, batcave?
[20:06:28] <joal>	 sure mforns 
[20:06:31] <mforns>	 k
[20:07:35] <joal>	 mforns, Nettrom: can you try again?
[20:08:14] <joal>	 fdans: same to you, can you please have a look and validate the approach (aqs-to-druid)?
[20:08:48] * Nettrom tries again
[20:09:51] <Nettrom>	 success! thanks so much joal, awesome quick fixing :)
[20:10:13] <joal>	 np Nettrom, thanks for letting me know :)
[20:13:42] <joal>	 a-teams and fellows, I'm gone :) See you next week :)
[20:17:09] <mforns>	 byeeeeeeeeee
[20:47:43] <wikibugs_>	 (Abandoned) Mforns: [wIP] Close all sessions to avoid database connection errors [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/373078 (https://phabricator.wikimedia.org/T173585) (owner: Mforns)
[20:53:28] <wikibugs_>	 (PS1) Mforns: Clear connections between report executions [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/373967 (https://phabricator.wikimedia.org/T173585)
[20:54:18] <wikibugs_>	 (CR) jerkins-bot: [V: -1] Clear connections between report executions [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/373967 (https://phabricator.wikimedia.org/T173585) (owner: Mforns)
[20:55:04] <wikibugs_>	 (CR) Mforns: "I tested this in staging and works. It solves the problem of max user connections. I also tested that all other functionalities of wikimet" [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/373967 (https://phabricator.wikimedia.org/T173585) (owner: Mforns)
[21:00:42] <wikibugs_>	 Analytics-Kanban, Patch-For-Review: Troubleshoot Wikimetrics "magic button" - https://phabricator.wikimedia.org/T173585#3554375 (mforns) Ok, I think I found a fix! The last patch is the one that counts, the previous ones, are commits to the mforns-dev branch, to be able to deploy to staging and test. I w...
[21:06:08] <mforns>	 have a nice weekend! bye
[22:48:33] <wikibugs_>	 Analytics, Analytics-Wikistats, Operations, Wikidata, and 6 others: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3554755 (gh87)