/notebooks/python_en/homework2_correction.ipynb

https://gitlab.com/gvallverdu/cours-python
  1. {
  2. "cells": [
  3. {
  4. "cell_type": "markdown",
  5. "metadata": {},
  6. "source": [
  7. "# Correction of homework 2: read in data\n",
  8. "\n",
  9. "The aim of this second homework is to validate the part of the course on Python syntax.\n",
  10. "\n",
  11. "## Instructions\n",
  12. "\n",
  13. "You have to:\n",
  14. "\n",
  15. "* Verify that you have mastered the notions discussed in the notebook on reading data.\n",
  16. "* Download the file `climat_perpignan.csv`.\n",
  17. "* Read the maximum temperature and the sunshine duration from the file.\n",
  18. "\n",
  19. "Two parts are required:\n",
  20. "\n",
  21. "* Read the file yourself, going through it line by line.\n",
  22. "* Use the `read_csv()` function of the pandas module to read the table.\n",
  23. "\n",
  24. "## Part 1: by hand"
  25. ]
  26. },
  27. {
  28. "cell_type": "code",
  29. "execution_count": 1,
  30. "metadata": {},
  31. "outputs": [
  32. {
  33. "name": "stdout",
  34. "output_type": "stream",
  35. "text": [
  36. "Janvier        12.400  141.200\n",
  37. "Février        13.200  160.800\n",
  38. "Mars           16.000  209.600\n",
  39. "Avril          18.200  218.000\n",
  40. "Mai            21.800  235.800\n",
  41. "Juin           26.200  268.900\n",
  42. "Juillet        29.200  298.200\n",
  43. "Août           28.900  267.400\n",
  44. "Septembre      25.400  222.200\n",
  45. "Octobre        21.000  167.600\n",
  46. "Novembre       15.900  149.200\n",
  47. "Décembre       13.100  126.100\n"
  48. ]
  49. }
  50. ],
  51. "source": [
  52. "# open the file in read mode \"r\"\n",
  53. "with open(\"climat_perpignan.csv\", \"r\") as f:\n",
  54. "    \n",
  55. "    # initialize the lists that will store the values\n",
  56. "    max_temperature = list()\n",
  57. "    sunshine = list()\n",
  58. "    \n",
  59. "    # skip the first two lines, which do not contain data\n",
  60. "    f.readline()\n",
  61. "    f.readline()\n",
  62. "    \n",
  63. "    # loop over the remaining lines of the file\n",
  64. "    for line in f:\n",
  65. "        # split the line on the semicolon ;\n",
  66. "        values = line.split(\";\")\n",
  67. "        \n",
  68. "        # take the values in columns 2 and 4\n",
  69. "        # replace , by . using the replace method\n",
  70. "        # convert to float\n",
  71. "        tmax = float(values[2].replace(\",\", \".\"))\n",
  72. "        sun = float(values[4].replace(\",\", \".\"))\n",
  73. "        \n",
  74. "        # store the values in the lists\n",
  75. "        max_temperature.append(tmax)\n",
  76. "        sunshine.append(sun)\n",
  77. "        \n",
  78. "        # print the values\n",
  79. "        month = values[0]\n",
  80. "        print(f\"{month:12s} {tmax:8.3f} {sun:8.3f}\")\n",
  81. "    "
  82. ]
  83. },
  84. {
  85. "cell_type": "markdown",
  86. "metadata": {},
  87. "source": [
  88. "You can print the content of each list.\n",
  89. "\n",
  90. "##### Maximum temperature"
  91. ]
  92. },
  93. {
  94. "cell_type": "code",
  95. "execution_count": 2,
  96. "metadata": {},
  97. "outputs": [
  98. {
  99. "data": {
  100. "text/plain": [
  101. "[12.4, 13.2, 16.0, 18.2, 21.8, 26.2, 29.2, 28.9, 25.4, 21.0, 15.9, 13.1]"
  102. ]
  103. },
  104. "execution_count": 2,
  105. "metadata": {},
  106. "output_type": "execute_result"
  107. }
  108. ],
  109. "source": [
  110. "max_temperature"
  111. ]
  112. },
  113. {
  114. "cell_type": "markdown",
  115. "metadata": {},
  116. "source": [
  117. "##### Sunshine"
  118. ]
  119. },
  120. {
  121. "cell_type": "code",
  122. "execution_count": 3,
  123. "metadata": {},
  124. "outputs": [
  125. {
  126. "data": {
  127. "text/plain": [
  128. "[141.2,\n",
  129. " 160.8,\n",
  130. " 209.6,\n",
  131. " 218.0,\n",
  132. " 235.8,\n",
  133. " 268.9,\n",
  134. " 298.2,\n",
  135. " 267.4,\n",
  136. " 222.2,\n",
  137. " 167.6,\n",
  138. " 149.2,\n",
  139. " 126.1]"
  140. ]
  141. },
  142. "execution_count": 3,
  143. "metadata": {},
  144. "output_type": "execute_result"
  145. }
  146. ],
  147. "source": [
  148. "sunshine"
  149. ]
  150. },
  151. {
  152. "cell_type": "markdown",
  153. "metadata": {},
  154. "source": [
  155. "## Part 2: read in the file with pandas\n",
  156. "\n",
  157. "First, you have to import the pandas module:"
  158. ]
  159. },
  160. {
  161. "cell_type": "code",
  162. "execution_count": 4,
  163. "metadata": {},
  164. "outputs": [],
  165. "source": [
  166. "import pandas as pd"
  167. ]
  168. },
  169. {
  170. "cell_type": "markdown",
  171. "metadata": {},
  172. "source": [
  173. "We now use pandas' [`read_csv()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html) function to read the file. Here are the elements we have to specify to read the file:\n",
  174. "\n",
  175. "* The name of the file is given as the first argument.\n",
  176. "* The separator is a semicolon => `sep`.\n",
  177. "* Only columns 0, 2 and 4 are used => `usecols`.\n",
  178. "* The first line is skipped => `skiprows`.\n",
  179. "* Decimal numbers are written with a comma => `decimal`.\n",
  180. "* The first column is used as the index => `index_col`."
  181. ]
  182. },
  183. {
  184. "cell_type": "code",
  185. "execution_count": 5,
  186. "metadata": {},
  187. "outputs": [],
  188. "source": [
  189. "df = pd.read_csv(\n",
  190. "    \"climat_perpignan.csv\",\n",
  191. "    sep=\";\",\n",
  192. "    usecols=(0, 2, 4),\n",
  193. "    skiprows=1,\n",
  194. "    decimal=\",\",\n",
  195. "    index_col=0\n",
  196. ")"
  197. ]
  198. },
  199. {
  200. "cell_type": "code",
  201. "execution_count": 6,
  202. "metadata": {},
  203. "outputs": [
  204. {
  205. "data": {
  206. "text/html": [
  207. "<div>\n",
  208. "<style scoped>\n",
  209. " .dataframe tbody tr th:only-of-type {\n",
  210. " vertical-align: middle;\n",
  211. " }\n",
  212. "\n",
  213. " .dataframe tbody tr th {\n",
  214. " vertical-align: top;\n",
  215. " }\n",
  216. "\n",
  217. " .dataframe thead th {\n",
  218. " text-align: right;\n",
  219. " }\n",
  220. "</style>\n",
  221. "<table border=\"1\" class=\"dataframe\">\n",
  222. " <thead>\n",
  223. " <tr style=\"text-align: right;\">\n",
  224. " <th></th>\n",
  225. " <th>Température maximale</th>\n",
  226. " <th>Durée d'ensoleillement (h)</th>\n",
  227. " </tr>\n",
  228. " </thead>\n",
  229. " <tbody>\n",
  230. " <tr>\n",
  231. " <th>Janvier</th>\n",
  232. " <td>12.4</td>\n",
  233. " <td>141.2</td>\n",
  234. " </tr>\n",
  235. " <tr>\n",
  236. " <th>Février</th>\n",
  237. " <td>13.2</td>\n",
  238. " <td>160.8</td>\n",
  239. " </tr>\n",
  240. " <tr>\n",
  241. " <th>Mars</th>\n",
  242. " <td>16.0</td>\n",
  243. " <td>209.6</td>\n",
  244. " </tr>\n",
  245. " <tr>\n",
  246. " <th>Avril</th>\n",
  247. " <td>18.2</td>\n",
  248. " <td>218.0</td>\n",
  249. " </tr>\n",
  250. " <tr>\n",
  251. " <th>Mai</th>\n",
  252. " <td>21.8</td>\n",
  253. " <td>235.8</td>\n",
  254. " </tr>\n",
  255. " <tr>\n",
  256. " <th>Juin</th>\n",
  257. " <td>26.2</td>\n",
  258. " <td>268.9</td>\n",
  259. " </tr>\n",
  260. " <tr>\n",
  261. " <th>Juillet</th>\n",
  262. " <td>29.2</td>\n",
  263. " <td>298.2</td>\n",
  264. " </tr>\n",
  265. " <tr>\n",
  266. " <th>Août</th>\n",
  267. " <td>28.9</td>\n",
  268. " <td>267.4</td>\n",
  269. " </tr>\n",
  270. " <tr>\n",
  271. " <th>Septembre</th>\n",
  272. " <td>25.4</td>\n",
  273. " <td>222.2</td>\n",
  274. " </tr>\n",
  275. " <tr>\n",
  276. " <th>Octobre</th>\n",
  277. " <td>21.0</td>\n",
  278. " <td>167.6</td>\n",
  279. " </tr>\n",
  280. " <tr>\n",
  281. " <th>Novembre</th>\n",
  282. " <td>15.9</td>\n",
  283. " <td>149.2</td>\n",
  284. " </tr>\n",
  285. " <tr>\n",
  286. " <th>Décembre</th>\n",
  287. " <td>13.1</td>\n",
  288. " <td>126.1</td>\n",
  289. " </tr>\n",
  290. " </tbody>\n",
  291. "</table>\n",
  292. "</div>"
  293. ],
  294. "text/plain": [
  295. "           Température maximale  Durée d'ensoleillement (h)\n",
  296. "Janvier                    12.4                       141.2\n",
  297. "Février                    13.2                       160.8\n",
  298. "Mars                       16.0                       209.6\n",
  299. "Avril                      18.2                       218.0\n",
  300. "Mai                        21.8                       235.8\n",
  301. "Juin                       26.2                       268.9\n",
  302. "Juillet                    29.2                       298.2\n",
  303. "Août                       28.9                       267.4\n",
  304. "Septembre                  25.4                       222.2\n",
  305. "Octobre                    21.0                       167.6\n",
  306. "Novembre                   15.9                       149.2\n",
  307. "Décembre                   13.1                       126.1"
  308. ]
  309. },
  310. "execution_count": 6,
  311. "metadata": {},
  312. "output_type": "execute_result"
  313. }
  314. ],
  315. "source": [
  316. "df"
  317. ]
  318. }
  319. ],
  320. "metadata": {
  321. "kernelspec": {
  322. "display_name": "Python 3",
  323. "language": "python",
  324. "name": "python3"
  325. },
  326. "language_info": {
  327. "codemirror_mode": {
  328. "name": "ipython",
  329. "version": 3
  330. },
  331. "file_extension": ".py",
  332. "mimetype": "text/x-python",
  333. "name": "python",
  334. "nbconvert_exporter": "python",
  335. "pygments_lexer": "ipython3",
  336. "version": "3.8.5"
  337. }
  338. },
  339. "nbformat": 4,
  340. "nbformat_minor": 2
  341. }
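
As a complement to Part 1, the same by-hand reading can be sketched with the standard-library `csv` module, which takes care of splitting each line on `;`. This is only a sketch under stated assumptions: the in-memory sample stands in for `climat_perpignan.csv`, its layout (two non-data lines, then five semicolon-separated columns) is inferred from the notebook code, and the title line, header names and `-` cells are placeholders rather than real data. The helper names `to_float` and `read_climate` are hypothetical.

```python
import csv
import io

# Stand-in for climat_perpignan.csv. Layout assumed from the notebook code:
# two non-data lines, then month;?;max temperature;?;sunshine.
# The title, header names and "-" cells are placeholders, not real data.
SAMPLE = """title line (placeholder)
month;col1;tmax;col3;sun
Janvier;-;12,4;-;141,2
Février;-;13,2;-;160,8
Mars;-;16,0;-;209,6
"""


def to_float(text):
    """Convert a number written with a decimal comma into a float."""
    return float(text.replace(",", "."))


def read_climate(f):
    """Return (months, max_temperature, sunshine) read from an open file."""
    months, max_temperature, sunshine = [], [], []

    # skip the first two lines, which do not contain data
    next(f)
    next(f)

    # csv.reader splits each remaining line on the semicolon
    for row in csv.reader(f, delimiter=";"):
        if not row:  # ignore empty lines, e.g. at the end of the file
            continue
        months.append(row[0])
        max_temperature.append(to_float(row[2]))
        sunshine.append(to_float(row[4]))

    return months, max_temperature, sunshine


months, max_temperature, sunshine = read_climate(io.StringIO(SAMPLE))
for month, tmax, sun in zip(months, max_temperature, sunshine):
    print(f"{month:12s} {tmax:8.3f} {sun:8.3f}")
```

With the real file, `read_climate` would presumably be called inside a `with open("climat_perpignan.csv", "r", newline="") as f:` block; passing `newline=""` is what the `csv` documentation recommends for files handed to `csv.reader`.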