Unknown Speaker 0:00
was, um,

Unknown Speaker 0:02
I just heard that this meeting is being recorded. I'm sure yesterday was a long day for everyone; I know it was for me. I'm sure we are all saturated, and hopefully not supersaturated, where all the information starts coming out of solution. We've got a lot of really exciting physics and lessons to go through, and we're going to start seeing more data and more plots. For those of you who were able to see Jesse's colloquium yesterday, I personally found it very inspiring. It gave me a lot of ideas about what to do with this data, and hopefully it'll give you a lot of ideas as well. I'd like to introduce Julie Hogan. Some of you saw Julie remotely on the video yesterday. She's a professor at Bethel University in Minnesota in the US, and she's going to continue walking us through some discussion of physics objects, which is where a lot of the physics gets done that you will be doing. I will pause there and hand it over to Julie.

Unknown Speaker 1:04
Okay, thanks a lot, Matt. Oh, I cannot screen share; it is disabled.

Unknown Speaker 1:16
So as soon as I can share my screen, we'll start by getting the Docker containers back up and running, and we'll head to the lesson web page.

Unknown Speaker 1:27
There we go. Okay.

Unknown Speaker 1:33
So now you see my screen. I want to thank you all for your graciousness yesterday. I was really sad I couldn't be part of that. I'm here in a hotel in Texas, giving this workshop. Okay, so this is the lesson web page for today, which I got to from the schedule. This is the first lesson on Thursday. What we're going to talk about today is jets and missing transverse momentum. That's going to be our big chunk this morning.
And then we'll move forward into how we run this AOD2NanoAOD tool to get all of your files. You won't actually have to do this, but it will hopefully give you enough information that you feel confident to do it later when you have the need. Then we will look at the Higgs to tau tau analysis in more detail before lunch. We will actually dive into that and look at how we make selections for an actual physics analysis using all of these objects that we have put together in the NanoAOD files that we've created. So we're going to look at three big questions in this lesson. First, what are jets and missing transverse energy? I took the tactic of pretending you all are my undergraduate students, just so that I didn't miss things, so hopefully for some of you it will be very boring, since we're at the very beginning. Then we'll talk about how we can identify jets from b quarks, the b-tagging concepts, and then how we deal with differences between data and simulation. So I'm going to go ahead and move first to CMS jets and MET. Again, how are jets and missing transverse energy accessed is our main question here. We'll identify these objects in the files, understand some of their basic features, and then do a little bit of practice accessing them. As I mentioned in the video last time, these are objects that we create from the particle-flow candidates in CMS. We have many different types of jets and MET that can be reconstructed, but the most common for analysis use are the particle-flow versions. In terms of particle flow, these are not unique objects. They're combinations of different objects, or calculations from different particle-flow candidates. That makes them much more complex and worthy of their own lesson. So let's take a look at some basic properties of these objects.
We have many particle-flow candidates in an event, and if you look at a map, or at one of those lego displays when you render your event, you see that some of them are spatially grouped. What we're seeing in the detector is that when hadronization occurs as a quark comes out from the collision, we get a spray or shower of particles that are long-lived in the sense that they traverse the detector, even if they're not stable. We get these sprays of particles in the detector together, and their energy and their spatial deposition can tell us something about the initial particle that produced them. This connects back, if you remember, to that giant chunk of red and blue Lego bricks. We usually have a combination of charged hadrons, photons, and neutral hadrons making up jets, because there's such a wide variety of particles that produce them. But there could be hundreds of particle-flow candidates in a jet. You can actually ask a jet how many constituents it has, and you'll get very large numbers. So if someone told me, Julie, you must do a physics analysis from only particle-flow candidates,

Unknown Speaker 5:49
I would do a B physics analysis, or something that didn't require any jets. So how do we put them together from the particle-flow candidates? This might be very familiar to some of you, because this is not a CMS-specific approach; this is just jet clustering in general. We have this clustering procedure where spatial relationships and momentum relationships between particles are evaluated. If the relationship meets a certain criterion for those particles to be combined, then they are combined. And you do this iteratively until every combination of particles that remains fails the criterion.
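The iterative procedure just described can be sketched in a few lines. This is only a toy illustration, not the clustering code CMS runs: it assumes the standard generalized-kT distance measures (exponent `p = -1` gives anti-kT, `p = 1` gives kT, `p = 0` gives Cambridge/Aachen), takes particles as hypothetical `(pt, y, phi)` tuples, and merges them with a simplified pT-weighted average instead of adding four-vectors.

```python
import math

def delta_r2(a, b):
    """Squared distance in (y, phi) between two (pt, y, phi) tuples."""
    dphi = abs(a[2] - b[2])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return (a[1] - b[1]) ** 2 + dphi ** 2

def merge(a, b):
    """Simplified pT-weighted merge (assumption: real clustering adds four-vectors)."""
    pt = a[0] + b[0]
    y = (a[0] * a[1] + b[0] * b[1]) / pt
    dphi = b[2] - a[2]
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    elif dphi < -math.pi:
        dphi += 2.0 * math.pi
    phi = a[2] + b[0] * dphi / pt
    phi = math.atan2(math.sin(phi), math.cos(phi))  # wrap back into (-pi, pi]
    return (pt, y, phi)

def cluster(particles, radius=0.5, p=-1):
    """Sequential recombination: merge the closest pair until only jets remain."""
    objs = [tuple(x) for x in particles]
    jets = []
    while objs:
        best = None  # (distance, i, j); j is None for the beam distance
        for i, a in enumerate(objs):
            d_ib = a[0] ** (2 * p)
            if best is None or d_ib < best[0]:
                best = (d_ib, i, None)
            for j in range(i + 1, len(objs)):
                b = objs[j]
                d_ij = min(a[0] ** (2 * p), b[0] ** (2 * p)) * delta_r2(a, b) / radius ** 2
                if d_ij < best[0]:
                    best = (d_ij, i, j)
        _, i, j = best
        if j is None:
            jets.append(objs.pop(i))   # nothing left to merge with: this is a jet
        else:
            b = objs.pop(j)            # j > i, so remove j first
            a = objs.pop(i)
            objs.append(merge(a, b))
    return sorted(jets, key=lambda jet: -jet[0])

# Two well-separated sprays of hypothetical particles
toy_event = [(50, 0.0, 0.0), (10, 0.1, 0.1), (5, -0.1, 0.05), (40, 1.5, 2.5), (3, 1.6, 2.4)]
toy_jets = cluster(toy_event, radius=0.5, p=-1)
```

With `p = -1` the hardest particles absorb their soft neighbours first, which is why the resulting jets tend to be centered on the hardest particle in each spray.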
And so CMS, again, will use many different types of clustering algorithms. Oh, I apologize. Is my audio not good?

Unknown Speaker 6:53
I hear you fine. Okay.

Unknown Speaker 6:55
I also hear you fine. Yeah.

Unknown Speaker 6:56
Okay, nice. She just says she might be having trouble.

Unknown Speaker 7:01
Okay, I'll continue. So this is an awesome graphic that I believe one of my CMS friends put together; apologies to the theory community if that's not true. This is just a graphical representation of what it looks like to combine particles into jets. We look for the two particles with the smallest separation according to this function d_ij and continue merging them. Before I scroll down past the function, I'll point out that all of the powers here on the function are negative two: the minimum of pT,i and pT,j, each to the power of negative two. That means we're using anti-kT clustering. So there are these kT, anti-kT, and Cambridge/Aachen algorithms, and typically in CMS analyses we use the anti-kT jets. We would land at something that looks like the final picture, where we've reached the end of the clustering sequence, we've stopped combining particle-flow candidates together, and we identify the groupings as jets, and inside a jet there are constituents. So the jets end up looking something like this. This is a visual representation of some anti-kT jets in the space of rapidity y and azimuthal angle phi. They have a roughly circular shape, and they're usually centered around the hardest particle that was in that group or shower. So these jets are fairly straightforward to conceptualize: they have a conical shape, and all of the particle-flow candidates within that color grouping would be constituents of the same jet. Okay. We do have to deal with pileup in our jets.
So we have multiple particle interactions in the same bunch crossing of the Large Hadron Collider. It would be great if we could get all of our data from just one interaction per crossing, but it would be too rare for that collision to be interesting. So in order to maximize the chances of interesting collisions, we have multiple of them per bunch crossing, and we have to deal with the gunk that this puts into the detector. There are two methods in CMS. The most common in Run 1 was called charged hadron subtraction. The most common upcoming in Run 3, and already gaining popularity in Run 2, is called PUPPI, which is where we subtract off pileup on a per-particle basis with a system of weights, as opposed to charged hadron subtraction, where tracks that can be connected directly to a different primary vertex are removed from the jet. We'll talk a little bit more about pileup in the third episode when we do the jet energy corrections, because that is typically where we deal with it for these jets.

Unknown Speaker 10:31
So

Unknown Speaker 10:32
let's look at accessing jets in the CMS software. If you have your code up from yesterday, or we'll take a pause and you can pull it up, I would like to look at the jets section of the AOD2NanoAOD.cc source file. This code block is just for information, but what it's showing you is that we're opening a collection of jets based on a handle, just the same way you did yesterday in the exercises for muons, electrons, and taus. So we can access the pT, eta, phi, mass, all of those things, in exactly the same way. Someone asked a great question on Mattermost last night: how do we find the list of all those methods, which four-vector methods are possible? The answer is, I did find a header that has a list of the generic momentum and energy variables that you can pull.
And also, even if you are just confident getting these four, you can build some sort of Lorentz vector. There are a couple of different classes that can do that and give you any variety of four-vector information that you need. So you can check back on Mattermost in the physics objects one from last night. That was one of the more recent bits of discussion, so you can hopefully find it pretty quickly.

Unknown Speaker 12:08
Yes.

Unknown Speaker 12:21
Thank you. That's a great question.

Unknown Speaker 12:24
So, AK. I'll go back up to this picture. AK is our abbreviation for anti-kT. If you saw KT, that would just be the kT clustering algorithm; AK is the anti-kT clustering algorithm. The five refers to how wide the radius of these jets, of the cones, is in angular space. So five stands for a radius of 0.5, and that would be the parameter put into the clustering algorithm for how large to make the cones. That's a great question. So we're loading in from an input tag the anti-kT, radius 0.5, particle-flow jets, and calling them a vector called jets.

Unknown Speaker 13:14
So

Unknown Speaker 13:16
I neglected to change this, and I believe it's just a website problem: where it says the handle is a CaloJetCollection, that should be a PFJetCollection. I think that is correct in the code that you're looking at in front of you. So for the first challenge here, what we're going to start with is identification for these jets. Just like all the other objects from before, we need to filter out some noise; it is totally possible for noise in the detector to be clustered as a jet. So we do have a defined jet ID. I've linked here to a paper that you can look at later, and it makes this statement.
Particle-flow jets are required to have a charged hadron fraction greater than zero if they're within the tracking fiducial region, a certain range of pseudorapidity. They must have a neutral hadron fraction of less than one, a charged electromagnetic fraction of less than one, and a neutral electromagnetic fraction of less than one. What this is basically saying is that 100% of the jet's energy is not allowed to come from one type of particle, whether that's an electron, a photon, or a neutral hadron. And if the jet is within the region of the tracker, there must be at least something that's charged in that jet. So as a little warm-up, we'll take a moment or two to look at this and let everybody get their code back up. In the actual GitHub repository, and I'll open this up for CMSSW, we can look at one of those header files to see what we can do with particle-flow jets. Then what I would like you to try to do is implement this ID on your jets. In this loop, you can make an if statement to get rid of the jets that would not pass these criteria. So I've just clicked here on DataFormats/JetReco, and we're in the CMSSW_5_3_X branch of the CMS software GitHub repository. This is the holder of everything. I'm going to go to the header files; that's where I like to start to understand the methods. And I'm going to scroll down to PFJet. There are a bunch of different things in here. As I noted in the challenge text, you can check back to the actual AOD2NanoAOD source code to confirm this, but I'm going to click on PFJet. What we have available here as methods are a whole bunch of different energy fractions. There's the charged hadron energy fraction, which is going to be a number that's a fraction of the jet's energy, and the same for neutral hadron energy, photon energy, electron energy, etc.
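As a rough illustration of the quoted jet ID statement, the selection logic could be sketched as below. This is a plain-Python stand-in, not the workshop's C++ solution: the `JetFractions` container and its field names are hypothetical, and the tracker boundary `TRACKER_ETA_MAX = 2.4` is an assumption, since the statement above only says "a certain range of pseudorapidity".

```python
from dataclasses import dataclass

@dataclass
class JetFractions:
    """Hypothetical container for the energy fractions of one jet."""
    eta: float
    charged_hadron: float
    neutral_hadron: float
    charged_em: float
    neutral_em: float

# Assumed edge of the tracking fiducial region; not stated in the talk.
TRACKER_ETA_MAX = 2.4

def passes_noise_jet_id(jet: JetFractions) -> bool:
    """Reject jets whose energy comes entirely from one kind of particle."""
    if jet.neutral_hadron >= 1.0:
        return False
    if jet.charged_em >= 1.0:
        return False
    if jet.neutral_em >= 1.0:
        return False
    # Inside the tracker there must be at least some charged hadron energy.
    if abs(jet.eta) < TRACKER_ETA_MAX and jet.charged_hadron <= 0.0:
        return False
    return True

good_jet = JetFractions(eta=0.5, charged_hadron=0.6, neutral_hadron=0.1,
                        charged_em=0.05, neutral_em=0.25)
photon_like = JetFractions(eta=0.5, charged_hadron=0.0, neutral_hadron=0.0,
                           charged_em=0.0, neutral_em=1.0)
forward_jet = JetFractions(eta=3.0, charged_hadron=0.0, neutral_hadron=0.7,
                           charged_em=0.0, neutral_em=0.3)
central_neutral = JetFractions(eta=1.0, charged_hadron=0.0, neutral_hadron=0.7,
                               charged_em=0.0, neutral_em=0.3)
```

In a jet loop, the equivalent of `if not passes_noise_jet_id(jet): continue` would skip the noise-like jets before any branches are filled.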
Unknown Speaker 16:43
Julie, sorry. There's a question from Jesse. Okay. Hi, Julie. So a question about the jet ID.

Unknown Speaker 16:52
You mentioned pileup. Is the jet ID stable as you change the amount of pileup, or are those requirements ones where I have to be careful about how pileup might change, let's say, whether you're above or below those inequalities? Because just a little bit of pileup could easily make a jet pass the quality criteria.

Unknown Speaker 17:16
Um, yes, that's right. So if you want to put more labels on it, this is what we would call a noise jet ID, as opposed to a pileup jet ID. Got it. So the goal of this is not, as you say, to filter out all the pileup jets, but rather to filter out things that are not actually jets.

Unknown Speaker 17:35
Great, thank you.

Unknown Speaker 17:42
So here are some of these methods. We've got the ability to extract the charged hadron energy fraction, here we go, charged hadron energy fraction, neutral hadron energy fraction, photon energy fraction, electron energy fraction, and so on. There are actually several more. So as a warm-up, take a few minutes and put together this if statement. I'll go back to sharing the page that has the sentence. So if anybody has any questions, ask. We'll take a pause for a few minutes.

Unknown Speaker 19:55
Okay, I think, in the interest of time, I'll show you an example of some

Unknown Speaker 20:05
of a script where we've done this.

Unknown Speaker 20:09
Oh, no.

Unknown Speaker 20:17
Um, so this is a little example

Unknown Speaker 20:21
of how this might look to apply the jet ID. In this example, they've pulled out these four different pieces: neutral hadron, neutral electromagnetic, charged hadron, and charged electromagnetic energy. And they've actually constructed the fractions here by dividing by the raw energy of this jet.
We'll talk about what that means a little later, but we want to evaluate these fractions before any corrections are made to the jet energy. So that's what they've done. For you and your jets, you're safe to just use the fraction methods rather than doing this. Then we just put together the inequality. So a good jet is equal to: this value less than something, this value less than something else, this value greater than something, this value less than something. And in this case it also requires some constituents; this is a slightly newer version of the ID. So you can see how this is put together. Then we would basically skip things if they are not a good jet. In this case, we're printing that it's a bad jet, so we can learn and understand why it's a bad jet, but then we just skip and do nothing else. So your if statement might look something similar: if this and this and this and this, keep going; else do nothing. Does anybody have any other questions about that aspect of jets before we move on to MET?

Unknown Speaker 22:01
Okay, sounds like we'll go on. So here, just as a reminder, you have the reference to the header files, so you can trace them back and find methods if you're curious what other methods exist. So let's talk briefly about missing transverse momentum. This is the negative vector sum of the transverse momenta of all the particle-flow candidates in an event. We refer to the vector as missing transverse momentum, and we refer to its magnitude as missing transverse energy. I will mix them up without paying any attention to the way I'm talking about it, but we usually just use the acronym MET.
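The definition just given, the negative vector sum of the transverse momenta of all candidates, can be written out as a small sketch. The candidates here are hypothetical `(pt, phi)` pairs rather than real particle-flow objects, and the function name is made up for illustration.

```python
import math

def missing_transverse_momentum(candidates):
    """Return (magnitude, phi) of the negative vector sum of (pt, phi) candidates."""
    px = -sum(pt * math.cos(phi) for pt, phi in candidates)
    py = -sum(pt * math.sin(phi) for pt, phi in candidates)
    return math.hypot(px, py), math.atan2(py, px)

# Two back-to-back candidates of equal pT balance each other: no MET.
balanced_met, _ = missing_transverse_momentum([(30.0, 0.0), (30.0, math.pi)])

# An unbalanced event: 50 GeV one way, only 20 GeV visible the other way.
# The MET points opposite to the harder candidate, with magnitude 30.
unbalanced_met, unbalanced_phi = missing_transverse_momentum([(50.0, 0.0), (20.0, math.pi)])
```

The magnitude is what gets called missing transverse energy, and the angle is the direction in the transverse plane in which something appears to have escaped.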
The assumption is that this is either fake, meaning it doesn't actually exist and is a residue of detector mismeasurement, or it represents particles like neutrinos, or dark matter, or SUSY lightest supersymmetric particles, or whatever. So this is our way of detecting objects that would not interact in the CMS detector. What we will look at now is one type of MET collection, and then in the final episode here we will look at a different type. So MET is constructed from all the particle-flow candidates, but it also relies on information about jets, because we have ways of calibrating some of these objects after particle flow. For jets, we can calibrate their energy after particle flow, and we propagate that to the MET, which we'll show you later.

Unknown Speaker 23:44
So in your script,

Unknown Speaker 23:48
I don't remember if it's above or below the jets, but at some point we open a PFMETCollection. It has a similar label to the jets. Typically we like to link together the algorithms that we have used for MET and for jets, because they are so intertwined. So since we're using the AK5 PF jets, we also open the PF MET. And this structure looks different: we do not have a for loop over items in a collection, even though it is called a PFMETCollection. We really just have one object, which is this calculation based on all the particle-flow candidates. The important things to pull out of MET for using it as a vector are the pT and the phi. It doesn't have a well-defined eta, and it doesn't have a well-defined z component, so we don't ask for the eta of the MET. We just ask for the pT and for the phi. In this example we've also pulled out something else from this calculation: we can actually get the sum of all the transverse energy in the event as a positive value.
So we can add up the scalar sum of all the transverse energy in the event, which can be a very interesting variable as well, depending on your physics process. So what we call MET pT is the actual missing transverse energy, and the MET phi is the angle where it's directed. The second block here is what's called MET significance. I really like MET significance, because this is what I worked on for my master's thesis on DØ: getting the missing energy significance algorithms working for DØ. What we have here is a value of the significance, that's the first line, and we also have some uncertainty, some covariance matrix elements. What this significance means is that it's a way to measure whether the missing energy came from detector noise or from real particles that did not interact in the detector. You use your knowledge about how well you can measure different physics objects to understand whether, aha, this MET pointed exactly in this way, so it's more likely to be a detector mismeasurement, or, oh, no, the MET is pointed over here, and it has this magnitude, and all my other objects are far away. So it's a way of numerically codifying mental decisions like that. I don't think this is the most updated version of this twiki, or of this website.

Unknown Speaker 26:51
I'm

Unknown Speaker 26:54
interesting. I was sure I got rid of things like this, but, oh, you know what, I've updated the code block, and I forgot to take out the word test me. So never mind. All right. I've put in a challenge here. Actually, we're a little bit behind, so I think I'm going to leave this for you all to do asynchronously later if you would like. But it's very interesting to look at the missing energy and learn how to infer differences between real and fake.
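To make the role of the covariance matrix elements a little more concrete, here is a minimal sketch of how a significance can be formed from a MET vector and a 2x2 covariance matrix. It assumes the Gaussian-resolution form, in which the significance is the MET vector contracted with the inverse covariance matrix; the function and argument names are hypothetical, and the value stored in the files is computed by the CMS software, not by this snippet.

```python
def met_significance(met_x, met_y, cov_xx, cov_xy, cov_yy):
    """Gaussian-approximation significance: met^T * inverse(cov) * met."""
    det = cov_xx * cov_yy - cov_xy ** 2
    if det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    # The inverse of [[xx, xy], [xy, yy]] is [[yy, -xy], [-xy, xx]] / det.
    return (met_x ** 2 * cov_yy
            - 2.0 * met_x * met_y * cov_xy
            + met_y ** 2 * cov_xx) / det

# 30 GeV of MET along x with a 10 GeV resolution in each direction:
well_measured = met_significance(30.0, 0.0, 100.0, 0.0, 100.0)

# The same MET, but pointing along a direction measured three times worse:
poorly_measured = met_significance(30.0, 0.0, 900.0, 0.0, 100.0)
```

The same 30 GeV of MET is much less significant when it points along a poorly measured direction, which is the kind of judgment the significance is meant to encode as a number.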
In the actual configuration files here, for simulation, what I've put in is another test file. So you've got the ttbar test file you used yesterday, and now you also have a Drell-Yan test file. You could run this with the Drell-Yan test file, where in general we would expect missing energy only from detector mismeasurement, versus ttbar, where we could easily have several neutrinos in the event producing missing energy. So it's interesting to just play around with the two different files and look at differences in the MET, or perhaps in the MET significance. So let me take some questions. Let's take a small pause, and then I will move on to the tagging episode.

Unknown Speaker 28:22
So if you open up, let's see if I have the code here, if you open up your simulation configuration file in the configs directory... ah, you know what, we need to get to the point where I tell you to do a git pull, which I thought was just here at the beginning, but maybe is in the JEC

Unknown Speaker 28:54
maybe is in the JEC episode. Come on. Hello, there.

Unknown Speaker 28:59
So let's get to that point. Yeah, we'll keep going. Once we get to the point where we do a git pull, you'll have a new test file right here.

Unknown Speaker 29:11
Any other questions? Sorry about that. Okay.

Unknown Speaker 29:23
Yes, sorry. I have a question that maybe is related. I'm just thinking, and if I'm wrong, please let me know. The collisions in the experiment don't happen completely head-on. What I'm saying is that there is a crossing angle.

Unknown Speaker 29:44
Right? So

Unknown Speaker 29:46
in theory, we have a head-on, perfect-line collision. In practice, it doesn't happen. So in theory, we shouldn't be able to, like, balance the momentum, I mean, just to use the fact that we have zero momentum in one plane.
And, well, we balance in that plane, and we know that if there is any momentum lost, we will assign this hidden momentum to the MET. Right. But how do we deal with that? Because I'm kind of thinking that we give this lack of precision in the balance to the MET.

Unknown Speaker 30:49
Yeah, I will be completely honest with you. I don't know if CMS does any special treatment to account for that. I don't actually know the detector or the collision setup well enough to know if it's always a repeatable imbalance, if that makes sense, or whether it would randomize itself out. We do assume that in the transverse plane, we just have to make the assumption that we sum to zero. So you're right, everything that would come from an imbalance in the initial state, to my understanding, would get propagated to the MET. And as end-user analysts, other than assigning uncertainties on the value of MET and its various corrections, we don't specifically have a procedure to say this is how we account for crossing-angle imbalances.

Unknown Speaker 31:55
But that's a really good question. I have not thought about that before.

Unknown Speaker 32:02
I would assume it is a small impact compared to many of our corrections, while they are

Unknown Speaker 32:07
in the micro. Microrad.

Unknown Speaker 32:12
Okay, range. But, you know, I mean, you say the microrad range, but we are looking at the physics in a

Unknown Speaker 32:21
good

Unknown Speaker 32:23
shape. So even if this crossing angle is small, I mean, it's not negligible. It's not zero. So probably we'll have to do some corrections for that. I don't know either if there is any correction, but this is something that I do not understand completely, then.
If this is a concern or not, but

Unknown Speaker 32:54
probably theorists will complain about that.

Unknown Speaker 32:59
Well, I think that's a great question. So let's put a pin in that, and I will certainly try to follow up with the experts and see whether this correction is one of the other corrections that we're making. That's possible. Okay, thanks a lot. So let's take a look at heavy-flavor tagging. This is one of the really active areas of development within jets. We'll talk specifically about heavy flavor, as opposed to tagging massive objects like W jets, or Higgs jets, or top jets, because I think

Unknown Speaker 33:38
a lot of the

Unknown Speaker 33:42
a lot of the methods are either not CMS-specific, because they're calculating something that's more generically

Unknown Speaker 33:50
applicable.

Unknown Speaker 33:53
And so let's look at heavy flavor, and the concepts transfer to those other objects if that's what you're interested in. Okay, so b-tagging algorithms have been around for a long time. They're based on special properties of B hadrons with respect to light hadrons: we expect that B hadrons, in particular, have a large mass and a long lifetime. So you can look for interesting things like secondary vertices, where your B hadron has traveled and then decayed. With all these properties, you get interesting differences in the properties of the tracks and the vertices within the jet compared to light-quark decays, and those can be exploited in different ways. There are some really simple methods: let's just count the number of tracks that have a significant impact parameter. If the collision is here, and the impact parameter of this track is way out here, it probably emerged from a secondary vertex rather than from the primary interaction.
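The simple track-counting idea just described could look something like the sketch below. It is only an illustration of the concept: the input is a hypothetical list of impact parameter significances (impact parameter divided by its uncertainty) for the tracks in one jet, and the threshold of 3 and the requirement of two displaced tracks are made-up working values, not CMS-recommended ones.

```python
def count_displaced_tracks(ip_significances, threshold=3.0):
    """Count tracks whose impact parameter significance exceeds the threshold."""
    return sum(1 for s in ip_significances if s > threshold)

def looks_b_like(ip_significances, min_tracks=2, threshold=3.0):
    """Flag a jet as b-like if enough of its tracks are significantly displaced."""
    return count_displaced_tracks(ip_significances, threshold) >= min_tracks

# Hypothetical jets: one with several displaced tracks, one with prompt tracks only.
displaced_jet_tracks = [0.5, 4.2, 7.1, -1.0, 3.5]
prompt_jet_tracks = [0.5, 1.0, -0.3]
```

A jet with several tracks pointing well away from the primary vertex is flagged, while a jet whose tracks are all compatible with the primary interaction is not.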
And so you can do simple things like that. Of course, we tend to gain more information by putting lots of different variables into machine-learning applications. One of the big workhorse b-tagging algorithms for CMS is called the combined secondary vertex algorithm, and this is based on a boosted decision tree. The value that we will look at ranges from zero to one, or negative one to one, so it looks like a BDT output discriminant. It takes in variables about the jets themselves, about the tracks inside, and about the secondary vertices inside, and pools them together for the maximum amount of information. We're going to look today at two methods of accessing this. We'll look at the first one right now, which is to use the same PF jets we have used before. Again, we're doing everything in AOD2NanoAOD.cc. I've combined some code snippets from different places in the code, so there are lots of gaps here, but these are the highlights. We have two header files that have been included here: this one we looked at before for the PFJet, and then there's also this type of class called a JetTag. What we're going to see is that for particle-flow reco jets, where we have this class that's a reco::Jet, the tagging information is stored separately, with references between the objects. So when we get the jet handle, when we open up the AK5 PF jets, we get reco jet objects. We also need to open this collection, which lives in the input file, where the input tag is basically combinedSecondaryVertexBJetTags. But what this looks like, let's see if I can show you. It would really like to stay on this screen.

Unknown Speaker 37:22
In the data thing, you may have this file or you may not; we'll do a git pull in a moment.

Unknown Speaker 37:29
I've just stored an edmDumpEventContent.
So you did this interactively in the exercise yesterday, and I've just written the results into a text file so that we can search through them. So, combinedSecondaryVertexBJetTags, here they are, right here. This is an extremely long class type name. It's an edm::AssociationVector between an edm::RefToBaseProd of reco::Jet and a vector of floats. It's got all of these types in the association vector, so it's a way of linking between a reco jet and a vector of floats. The vector of floats is going to be the actual b-tagging algorithm number, and the reco jet is which jet it belongs to. So we're going to open up this really complex thing and look at how to attach the values individually to a jet. When we begin to loop over the jets, you see that we apply some criteria: there's a minimum pT, and you may have added some of your own criteria on the ID conditions, like jet energy fractions. Below all of the assignments of the four-vector branches, we pull out from the b-tags collection here. So there's the link to the b tags. We're basically using an operator to give it an iterator in our list of jets, and then we're asking for the second item in this association. So we're asking for the float that belongs to that particular jet. And we can investigate this. If you have a file from yesterday, we can actually look at this branch. So I'm cheating, I apologize; I'm on Windows. I'm cheating and I'm logged into CERN, which gives me the ability to have X windows. I didn't get one of the workarounds for doing plotting on Windows working. So if you're in that boat with me, I will try to pull up a plot here, and we will look at it together. Just in case this is your first time, or it's been too long since yesterday: when we open the file like this, ROOT has called it _file0, and we can look and see what is in here.
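The jet-to-discriminator association described a moment ago, where the b-tag value lives in a separate collection and is looked up by the jet's position, can be mimicked with plain Python containers. This is only a conceptual stand-in: the lists, the field names, and the `attach_btag` helper are hypothetical and are not the CMSSW `edm::AssociationVector` interface.

```python
def attach_btag(jets, btags):
    """Pair each jet with the discriminator stored at the same position.

    `btags` mimics an association: a list of (jet_index, discriminator) pairs
    kept in the same order as `jets`. We take the second item of each pair.
    """
    values = []
    for index, _jet in enumerate(jets):
        ref, discriminator = btags[index]
        if ref != index:
            raise ValueError("association entry does not point back to this jet")
        values.append(discriminator)
    return values

# Hypothetical jets and their separately stored b-tag discriminators
toy_jets = [{"pt": 80.0}, {"pt": 45.0}, {"pt": 32.0}]
toy_btags = [(0, 0.91), (1, 0.12), (2, 0.47)]
toy_jet_btag = attach_btag(toy_jets, toy_btags)
```

The point of the pattern is that the jet objects themselves carry no b-tag value; the loop index is what ties each jet to its entry in the separate collection.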
And what we have is a directory, so we need to cd into that directory.

Transcribed by https://otter.ai