1 00:00:01,000 --> 00:00:01,870 Hello. 2 00:00:01,870 --> 00:00:04,080 And welcome to Python for Informatics. 3 00:00:04,080 --> 00:00:05,630 Right now we're going to cover Chapter One. 4 00:00:05,630 --> 00:00:07,650 I'm Charles Severance from the University of Michigan. 5 00:00:07,650 --> 00:00:09,190 And I'm the author. 6 00:00:09,190 --> 00:00:13,590 And I'll be your lecturer, for this online lecture of the first chapter of the book. 7 00:00:15,570 --> 00:00:19,910 This lecture and my slides, and the book, as a matter of fact, are all open. 8 00:00:19,910 --> 00:00:21,930 Open content, open materials. 9 00:00:23,380 --> 00:00:25,520 They're copyright Creative Commons Attribution. 10 00:00:25,520 --> 00:00:29,510 And this video recording is also copyright Creative Commons Attribution. 11 00:00:29,510 --> 00:00:31,440 It's important to be explicit about copyright, and 12 00:00:31,440 --> 00:00:33,630 so I say it right at the beginning. 13 00:00:35,690 --> 00:00:39,580 So, computers basically want to be helpful. 14 00:00:39,580 --> 00:00:41,255 They are programmed. 15 00:00:41,255 --> 00:00:44,940 [SOUND] Matter of fact, this is a microprocessor. 16 00:00:44,940 --> 00:00:47,260 This is really just an electrical part. 17 00:00:47,260 --> 00:00:50,280 It's got wires and circuits inside of it. 18 00:00:50,280 --> 00:00:55,190 And somebody spent a lot of engineering time, to make it so that these pins in the 19 00:00:55,190 --> 00:01:00,580 back take instructions from us, from operating systems, 20 00:01:00,580 --> 00:01:02,960 from the hard drive, from the memory. 21 00:01:02,960 --> 00:01:05,330 Instructions come into here and then results come out. 22 00:01:05,330 --> 00:01:07,710 It's really sort of a very programmable hand 23 00:01:07,710 --> 00:01:10,690 calculator and it's our job to put instructions in. 24 00:01:10,690 --> 00:01:13,950 This thing, in a sense, is wired to 25 00:01:13,950 --> 00:01:17,570 be curious about what's next, right? It's like 26 00:01:18,900 --> 00:01:21,260 It, it's, like, tell me what you want to do next. 27 00:01:21,260 --> 00:01:22,170 What do you want to do next? 28 00:01:22,170 --> 00:01:24,510 What do you want to do next, and after that, what do you want to do next, 29 00:01:24,510 --> 00:01:27,950 and it just happens to do that a billion or so times a second. 30 00:01:27,950 --> 00:01:32,120 And, so that's sort of the, the low-level piece, and, but you can also think, if 31 00:01:32,120 --> 00:01:35,350 you have, like, a PDA, something like this, 32 00:01:35,350 --> 00:01:38,690 all the buttons on here are some kind of, 33 00:01:38,690 --> 00:01:40,580 you know, "what's next?" 34 00:01:40,580 --> 00:01:41,290 Right? 35 00:01:41,290 --> 00:01:43,090 Each of those is sort of something begging 36 00:01:43,090 --> 00:01:46,620 for my attention, some application developer who's built a 37 00:01:46,620 --> 00:01:49,940 really cool application and says please use me, please 38 00:01:49,940 --> 00:01:52,510 click me, I am sort of nothing without you. 39 00:01:52,510 --> 00:01:56,460 We humans are the things that sort of cause computers to start doing 40 00:01:56,460 --> 00:02:00,976 something and this will sit here happily until I caused it to do something. 41 00:02:00,976 --> 00:02:04,370 Now, [SOUND] whoa! 42 00:02:04,370 --> 00:02:05,220 Whoa. 43 00:02:05,220 --> 00:02:06,950 Hope it's still okay. 44 00:02:06,950 --> 00:02:08,480 Yeah, it seems to be fine. 45 00:02:08,480 --> 00:02:09,420 Seems to be fine. 46 00:02:09,420 --> 00:02:11,080 Takes a lickin', and keeps on tickin'. 47 00:02:11,080 --> 00:02:14,790 So, these, anyone can use, right? 48 00:02:14,790 --> 00:02:18,800 They say even animals can use a Macintosh, smartphone. 49 00:02:18,800 --> 00:02:21,070 And so you don't have to be a programmer. 50 00:02:21,070 --> 00:02:24,580 But to get this to do what you want 51 00:02:24,580 --> 00:02:27,030 you need to learn a different language, and we need to 52 00:02:27,030 --> 00:02:30,940 learn the language of the instructions to tell it what to do. 53 00:02:30,940 --> 00:02:32,410 So, that's what we're going to do. 54 00:02:32,410 --> 00:02:34,420 We're going to learn how to talk to this. 55 00:02:34,420 --> 00:02:35,420 Yo! 56 00:02:35,420 --> 00:02:38,320 because it's asking us a question, we have to give the answer. 57 00:02:40,870 --> 00:02:42,770 So, what's a programmer? 58 00:02:42,770 --> 00:02:44,490 A programmer is somebody who 59 00:02:44,490 --> 00:02:49,470 writes a program, which is a script or a set of instructions that tell 60 00:02:49,470 --> 00:02:53,810 one of these kinds of things what it is that they're supposed to do. 61 00:02:53,810 --> 00:02:57,870 And sometimes you're running a program like Moodle, an open source 62 00:02:57,870 --> 00:02:59,550 learning management system, or Sakai, 63 00:02:59,550 --> 00:03:02,010 another open source learning management system. 64 00:03:02,010 --> 00:03:04,330 And sometimes you'll even get paid to do that, right? 65 00:03:04,330 --> 00:03:05,800 Sometimes you do it for free, sometimes you 66 00:03:05,800 --> 00:03:08,500 get paid, sometimes you write things for yourself. 67 00:03:10,050 --> 00:03:12,900 And and but, if you think about it, all these 68 00:03:12,900 --> 00:03:16,690 applications on my iPhone, somebody's making some money off of these. 69 00:03:16,690 --> 00:03:19,430 They may not be able to quit their job, but a surprising number 70 00:03:19,430 --> 00:03:20,990 have been able to quit their job or start 71 00:03:20,990 --> 00:03:24,170 small companies, maybe not gigantic companies, but small companies. 72 00:03:26,380 --> 00:03:29,840 So these people that can put applications inside of our computers 73 00:03:29,840 --> 00:03:35,160 are programmers, because they understand the way that we talk to these computers. 74 00:03:35,160 --> 00:03:38,920 And part of what I'm going to try to do is to get you to move from 75 00:03:38,920 --> 00:03:43,850 the mindset of the end user who thinks of this as something just to click on 76 00:03:43,850 --> 00:03:46,500 to the mindset of the programmer, who's kind of 77 00:03:46,500 --> 00:03:49,680 on the inside trying to get out to you. 78 00:03:49,680 --> 00:03:51,870 So that's, as we sort of move from user to 79 00:03:51,870 --> 00:03:55,090 programmer, we move from outside to inside and we think 80 00:03:55,090 --> 00:03:56,420 of the world out there and it's like what are 81 00:03:56,420 --> 00:03:59,100 they going to put, push, what button are they going to push. 82 00:03:59,100 --> 00:04:00,450 So here's kind of a picture of that. 83 00:04:01,800 --> 00:04:04,550 So on the outside, we're users, we click on buttons, we click 84 00:04:04,550 --> 00:04:08,600 on websites, we click on buttons on our phones, et cetera, et cetera. 85 00:04:08,600 --> 00:04:12,980 But what's really going on inside all of that is there's a computer with 86 00:04:12,980 --> 00:04:17,810 a bunch of hardware inside of that, and it has inside of it data, 87 00:04:17,810 --> 00:04:24,490 networks, other information. And software is what makes all that make sense. 88 00:04:25,660 --> 00:04:28,550 And so, part of what I want you to do is I want you to stop 89 00:04:28,550 --> 00:04:30,420 thinking about how to use these things from 90 00:04:30,420 --> 00:04:34,380 the outside, and we move to becoming a programmer. 91 00:04:34,380 --> 00:04:35,610 We're someone on the inside. 92 00:04:35,610 --> 00:04:39,410 We're with the CPU, we are with the memory. 93 00:04:39,410 --> 00:04:41,370 We are with the network connection. 94 00:04:41,370 --> 00:04:45,600 We are doing things on behalf of the user, and presenting them back up to the user. 95 00:04:47,870 --> 00:04:49,580 So, why be a programmer? 96 00:04:49,580 --> 00:04:52,850 Now, this class is specifically not trying to turn you 97 00:04:52,850 --> 00:04:55,650 into a professional programmer, even though I'd be very proud 98 00:04:55,650 --> 00:04:58,180 if after five, ten more classes you were a 99 00:04:58,180 --> 00:05:01,530 professional programmer. But that's not the purpose of this class. 100 00:05:01,530 --> 00:05:03,320 Sometimes you just want to get something done. 101 00:05:03,320 --> 00:05:06,860 You got an Excel spreadsheet at work and the data's not right. 102 00:05:06,860 --> 00:05:08,750 You got the data from somebody else and it's 103 00:05:08,750 --> 00:05:12,240 got like extra spaces where it shouldn't have it or 104 00:05:12,240 --> 00:05:17,120 does missing fields or something, you gotta do something to it that Excel can't do. 105 00:05:17,120 --> 00:05:18,640 And you're, you're stuck, like, saying, Aw, I 106 00:05:18,640 --> 00:05:20,920 want to, I want to mess with this data and 107 00:05:20,920 --> 00:05:23,620 put it in Excel so I can do my job, but it's a pain in the neck. 108 00:05:23,620 --> 00:05:26,960 And I have to sit and bring it into a text editor like Microsoft Word 109 00:05:26,960 --> 00:05:30,530 and go line by line and make all kinds of mistakes and clean the data up. 110 00:05:30,530 --> 00:05:31,960 You can write a program to do that, 111 00:05:31,960 --> 00:05:33,320 and that's the kind of programs we're going to do. 112 00:05:33,320 --> 00:05:37,810 Programs that serve our needs inside the computer, but to serve our needs. 113 00:05:37,810 --> 00:05:42,690 Professional programmers tend to build things for other people to use, right? 114 00:05:42,690 --> 00:05:44,650 They, they tend to build things that everyone else does. 115 00:05:44,650 --> 00:05:49,480 But we're going to build stuff primarily for ourselves. 116 00:05:49,480 --> 00:05:50,720 So. 117 00:05:50,720 --> 00:05:51,820 What is code? 118 00:05:51,820 --> 00:05:52,570 What is software? 119 00:05:52,570 --> 00:05:55,660 We use these words pretty much independently, a program. 120 00:05:55,660 --> 00:05:58,530 It's really a sequence of stored instructions. 121 00:05:58,530 --> 00:06:01,790 We learn the language that this talks, and then 122 00:06:01,790 --> 00:06:05,110 we will feed the instructions in, one at a time. 123 00:06:05,110 --> 00:06:08,720 It takes them one at a time, it gives us back a result, we give it the next one. 124 00:06:08,720 --> 00:06:11,020 It gives it back, in, out, in, out. 125 00:06:11,020 --> 00:06:13,030 So it's really a sequence of stored instructions. 126 00:06:13,030 --> 00:06:15,890 But it's kind of more than that. 127 00:06:15,890 --> 00:06:19,290 It's, it's sort of like our creativity, and if you've been using some of 128 00:06:19,290 --> 00:06:21,040 my software, like my MOOC software, I 129 00:06:21,040 --> 00:06:23,980 spent about a month writing all that stuff. 130 00:06:23,980 --> 00:06:25,700 And it's like, it's me. 131 00:06:25,700 --> 00:06:29,070 I mean I'm, it's my vision of how cool stuff ought to work, right? 132 00:06:29,070 --> 00:06:32,380 And so it's more than just getting something done. 133 00:06:32,380 --> 00:06:34,930 It's also a sense of pride and a sense of accomplishment. 134 00:06:34,930 --> 00:06:37,580 Especially if you're giving something that other people can make use of. 135 00:06:37,580 --> 00:06:40,090 It's really, I think it's very creative. 136 00:06:40,090 --> 00:06:42,580 And it's what attracted me to being a programmer in the first place. 137 00:06:42,580 --> 00:06:46,230 It's that I could, I could leverage the abilities inside of here, 138 00:06:46,230 --> 00:06:49,790 and I could do things, cool things, on behalf of the user. 139 00:06:49,790 --> 00:06:54,570 So, code, software, a program. 140 00:06:54,570 --> 00:06:58,720 So, let's get a non-technical example of this. 141 00:07:00,270 --> 00:07:03,880 So, I'll have you link out to the YouTube [COUGH] for this. 142 00:07:03,880 --> 00:07:06,170 This is the Macarena. 143 00:07:06,170 --> 00:07:09,700 The Macarena is a song that 144 00:07:09,700 --> 00:07:11,750 has with it a well-known dance that everyone 145 00:07:11,750 --> 00:07:14,730 seems to know, or either get taught very quickly. 146 00:07:14,730 --> 00:07:18,993 So I'll, I'll stop and let you watch the Macarena, and then come back. 147 00:07:18,993 --> 00:07:22,016 [BLANK_AUDIO] 148 00:07:22,016 --> 00:07:23,590 So welcome back. 149 00:07:23,590 --> 00:07:24,680 I hope you enjoyed that. 150 00:07:25,910 --> 00:07:28,600 In a sense what we've got there is 151 00:07:28,600 --> 00:07:29,580 a program. 152 00:07:29,580 --> 00:07:30,830 A program for human beings. 153 00:07:31,868 --> 00:07:33,749 And, maybe you learned that at a club or 154 00:07:33,749 --> 00:07:36,160 something, and they told you what to do next. 155 00:07:36,160 --> 00:07:37,740 Well, I can teach you how to do 156 00:07:37,740 --> 00:07:41,160 the Macarena by writing a simple program right now. 157 00:07:41,160 --> 00:07:42,210 So here's my Macarena. 158 00:07:43,740 --> 00:07:45,990 While the music plays, means you do it over and over 159 00:07:45,990 --> 00:07:48,570 and over again, to the beat, that's kind of like computers. 160 00:07:48,570 --> 00:07:49,620 They do things in a beat. 161 00:07:49,620 --> 00:07:52,000 They happen to have three billion beats a second. 162 00:07:52,000 --> 00:07:55,300 But as it were, so we're going to do this multiple times so 163 00:07:55,300 --> 00:07:58,180 we have this whole group of instructions that we're going to do, right? 164 00:07:59,270 --> 00:08:05,130 Left hand out and up, right hand out and up, flip left hand, flip right hand, left 165 00:08:05,130 --> 00:08:11,960 hand to right shoulder, right hand to left shoulder, et cetera, et cetera. 166 00:08:11,960 --> 00:08:17,170 Now this particular little program has a mistake in it. 167 00:08:17,170 --> 00:08:18,350 Actually several. 168 00:08:18,350 --> 00:08:22,141 I want you to look and see if you can find the mistakes in the program. 169 00:08:22,141 --> 00:08:29,093 [BLANK_AUDIO] 170 00:08:29,093 --> 00:08:30,000 Okay. 171 00:08:30,000 --> 00:08:33,480 So here are the places that have the mistake. 172 00:08:33,480 --> 00:08:34,040 Right? 173 00:08:34,040 --> 00:08:34,850 The mistake is 174 00:08:37,540 --> 00:08:43,910 right ham to the back of the head, and left hand to right hit, not hip. 175 00:08:43,910 --> 00:08:47,790 Now if you're in a bar and you take a ham and you hit somebody 176 00:08:47,790 --> 00:08:52,360 in the back of the head, that's not very nice when you're dancing to this song. 177 00:08:52,360 --> 00:08:54,270 These are what's called bugs. 178 00:08:54,270 --> 00:08:56,550 Now a human reading this 179 00:08:56,550 --> 00:09:00,910 would say, oh, I think they meant to say "hand" here. 180 00:09:00,910 --> 00:09:03,380 But a computer is much more literal than people. 181 00:09:03,380 --> 00:09:06,530 We'll see a couple of exercises where we'll see that people 182 00:09:06,530 --> 00:09:11,780 can correct little mistakes like this, but computers they cannot, right? 183 00:09:11,780 --> 00:09:14,560 So we have to fix these bugs and we have to say 184 00:09:14,560 --> 00:09:16,750 right hand and we have to say hip when we mean hip. 185 00:09:16,750 --> 00:09:18,290 So we have to be explicit. 186 00:09:18,290 --> 00:09:20,680 Computers do exactly what we say. 187 00:09:20,680 --> 00:09:22,730 They don't do what we mean to do. 188 00:09:22,730 --> 00:09:24,800 So, let's clear that. 189 00:09:24,800 --> 00:09:27,120 Here is another example, okay? 190 00:09:27,120 --> 00:09:28,742 Let's see how this comes out. 191 00:09:28,742 --> 00:09:34,192 [SOUND] 192 00:09:34,192 --> 00:09:34,518 [MUSIC] 193 00:09:34,518 --> 00:09:37,158 You're supposed to count the number of 194 00:09:37,158 --> 00:09:40,580 times the word "the" appears in this sentence. 195 00:09:42,240 --> 00:09:43,830 Count it. 196 00:09:43,830 --> 00:09:46,030 And the word "the", how many times? 197 00:09:46,030 --> 00:09:53,015 [MUSIC] 198 00:09:53,015 --> 00:09:59,373 Okay, it's your turn. 199 00:09:59,373 --> 00:10:06,402 [BLANK_AUDIO] 200 00:10:06,402 --> 00:10:07,033 Now, here. 201 00:10:07,033 --> 00:10:10,212 [BLANK_AUDIO] 202 00:10:10,212 --> 00:10:12,150 This is not something humans are good at. 203 00:10:12,150 --> 00:10:14,690 I moved it around, I played a little music, I confused you, 204 00:10:14,690 --> 00:10:17,920 I put a picture of a clown car in the upper left-hand corner. 205 00:10:17,920 --> 00:10:20,130 Et cetera, et cetera, et cetera. 206 00:10:20,130 --> 00:10:23,840 Now it turns out that computers, once we tell 207 00:10:23,840 --> 00:10:27,490 them what to do, are very good at concentration. 208 00:10:27,490 --> 00:10:32,720 It can easily go through 30 words and find the most common word. 209 00:10:32,720 --> 00:10:35,250 Or three million words and find the most common word. 210 00:10:35,250 --> 00:10:36,800 And it'll never make a mistake. 211 00:10:36,800 --> 00:10:38,649 We first have to give it a set of instructions. 212 00:10:39,700 --> 00:10:44,670 So, I don't want you to learn this right now, but this is a Python program. 213 00:10:44,670 --> 00:10:49,770 Let's say that I want to teach, let you count words in files, okay? 214 00:10:49,770 --> 00:10:52,510 I say hey, I know how to program Python, I'll 215 00:10:52,510 --> 00:10:54,670 send you an email, and I'll send you this program. 216 00:10:54,670 --> 00:10:57,210 Just stick it into Python, and it will count words for you. 217 00:10:58,310 --> 00:10:58,500 Right? 218 00:10:58,500 --> 00:11:00,540 You've got a million words, million lines in a 219 00:11:00,540 --> 00:11:02,310 file, you want to find the most common word. 220 00:11:03,610 --> 00:11:04,890 And so, so here we go. 221 00:11:04,890 --> 00:11:07,750 So I will send you this file called words.py 222 00:11:07,750 --> 00:11:10,810 I spent a little time, it's a friendly gift to you. 223 00:11:10,810 --> 00:11:12,640 And this is what I type in. Now I'll give 224 00:11:12,640 --> 00:11:15,110 you kind of an outline on what this is going to do. 225 00:11:15,110 --> 00:11:16,630 First thing it's going to do is open a 226 00:11:16,630 --> 00:11:19,480 file and read it, then it's going to split 227 00:11:19,480 --> 00:11:23,790 the lines and files into words based on the spaces. 228 00:11:23,790 --> 00:11:26,000 Then it's going to run through and accumulate 229 00:11:26,000 --> 00:11:29,000 numbers like, you know, this word is one, 230 00:11:29,000 --> 00:11:31,770 this word is one, oh I saw that again so I turn that to two. 231 00:11:31,770 --> 00:11:33,650 That's what this does, it's a loop, it goes 232 00:11:33,650 --> 00:11:36,360 round and round and round, one for each word. 233 00:11:36,360 --> 00:11:38,264 Then what we're going to do is we're 234 00:11:38,264 --> 00:11:40,840 going to write another loop that's going to figure out 235 00:11:40,840 --> 00:11:42,688 which is the most common word by looking 236 00:11:42,688 --> 00:11:45,450 at all those little histograms that we built up. 237 00:11:45,450 --> 00:11:47,590 And then it's going to print those things out at the very end. 238 00:11:48,870 --> 00:11:53,570 And this can certainly do Python words.py and read clown.txt 239 00:11:53,570 --> 00:11:56,730 and tell us that the word "the" occurs seven times. 240 00:11:56,730 --> 00:12:00,290 But, you know, it can go, it can find out that a different thing 241 00:12:00,290 --> 00:12:04,220 has the word "to" and that it occurs sixteen times and it's just as fast. 242 00:12:04,220 --> 00:12:07,690 So yeah, you have to learn a language and you have to tell it what to 243 00:12:07,690 --> 00:12:13,050 do, but once you do it'll do it for a million or a billion words and happily. 244 00:12:13,050 --> 00:12:14,510 So you don't have to do menial work 245 00:12:14,510 --> 00:12:19,585 once you understand the way to instruct the computer to do menial work. 246 00:12:19,585 --> 00:12:22,520 [NOISE]