The NumPy Library

The NumPy library provides efficient functions for manipulating and numerically processing arrays.

These numeric storage structures are not exactly lists, although they can appear to behave like them (duck typing). Their elements are homogeneous, that is, they all share one data type, and arrays support both basic operations and more complex ones such as linear algebra.
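To make the contrast with lists concrete, here is a minimal sketch (the variable names are illustrative and not part of the original notebook). It shows that arithmetic on an array is applied element by element, and that mixed inputs are coerced to a single dtype.

```python
import numpy as np

values = [1, 2, 3]           # plain Python list
arr = np.array(values)       # homogeneous NumPy array

doubled_list = values * 2    # list repetition: [1, 2, 3, 1, 2, 3]
doubled_arr = arr * 2        # element-wise product: [2 4 6]

# Mixed ints and floats are upcast to one common dtype (float64 here)
mixed = np.array([1, 2.5, 3])

print(doubled_list)
print(doubled_arr)
print(mixed.dtype)
```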

NumPy is part of the core of other libraries such as Pandas.

https://numpy.org/doc/stable/index.html

The following shell command, run through the notebook's ! escape, checks whether the NumPy library is installed (requires macOS or Linux, or Google Colab, because it relies on grep):

[1]:
!pip freeze | grep numpy

numpy==2.1.3
[2]:
import numpy as np

data = np.array([1,0])
print(data[0])
print(type(data[0]))

data = np.array([[1,0],[2,0],[3,0]])
print(data[0][:0])

1
<class 'numpy.int64'>
[]
[3]:
print(data.shape)
print(data.size)
print(data.ndim)
print(data.dtype)
(3, 2)
6
2
int64

Supported data types:

  • int: int8, int16, int32, int64

  • uint: uint8, uint16, uint32, uint64

  • bool: bool_

  • float: float16, float32, float64, float128 (platform-dependent)

  • complex: complex64, complex128, complex256 (platform-dependent)
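The practical difference between these types is their size and value range. The sketch below is only an illustration under that assumption; it uses `np.iinfo` and `np.finfo` to inspect the limits and `nbytes` to compare memory use.

```python
import numpy as np

for dtype in (np.int8, np.int16, np.int32, np.int64):
    info = np.iinfo(dtype)            # integer limits for this dtype
    print(dtype.__name__, np.dtype(dtype).itemsize, info.min, info.max)

# Floating-point types expose their limits through np.finfo
f16 = np.finfo(np.float16)
print("float16 max:", f16.max)

# A smaller dtype uses less memory for the same number of elements
small = np.zeros(1000, dtype=np.int8)
large = np.zeros(1000, dtype=np.int64)
print(small.nbytes, large.nbytes)
```

Choosing the smallest dtype that safely holds the data is a common way to reduce memory, at the cost of a narrower range.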

[4]:
data = np.array([[1,0],[2,0]],dtype=np.int32)
data = np.array([[1,0],[2,0]],dtype=np.complex64)
data = np.array([[1,0],[2,0]],dtype=np.float16)
# Conversion
data = np.array(data,dtype=np.uint)
data = data.astype(np.bool_)
print(data)
[[ True False]
 [ True False]]

Generation

Loading values is not the only option. Sometimes you need to generate a random sample of points, a neutral vector (all zeros or all ones), or a scale of values.

[5]:
data = np.array(range(10))
print(data)

data = np.arange(10)
print(data)

data = np.zeros((10,5))
# https://numpy.org/doc/stable/reference/generated/numpy.zeros.html
print(data)

data = np.ones(10)
print(data)
[0 1 2 3 4 5 6 7 8 9]
[0 1 2 3 4 5 6 7 8 9]
[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[6]:
shape = (10,80) # 10 rows x 80 cols
data = np.zeros(shape)
print(data)
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0.]]
[7]:
data = np.linspace(0,1,10)
print(data)
[0.         0.11111111 0.22222222 0.33333333 0.44444444 0.55555556
 0.66666667 0.77777778 0.88888889 1.        ]
[8]:
data = np.logspace(0,2,10) # 10 points between 10**0 and 10**2 (the default base is 10)
print(data)
[  1.           1.66810054   2.7825594    4.64158883   7.74263683
  12.91549665  21.5443469   35.93813664  59.94842503 100.        ]
[9]:
data = np.identity(3)
print(data)
print("-"*10)

data = np.eye(3,k=1) # https://numpy.org/doc/stable/reference/generated/numpy.eye.html
print(data)
print("-"*10)

data = np.diag(range(1,4))
print(data)

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
----------
[[0. 1. 0.]
 [0. 0. 1.]
 [0. 0. 0.]]
----------
[[1 0 0]
 [0 2 0]
 [0 0 3]]

Random generation or sampling

[10]:
data = np.random.rand(3,3)
print(data)
print("-"*20)

data = np.random.randint(1,10,size=(2,2))
print(data)
print("-"*20)

data = np.random.randint(1,10,20).reshape(10,2)
print(data)
print("-"*20)

data = np.array(np.random.rand(20)*10,dtype=int).reshape(10,2)
print(data)
[[0.62202049 0.63910252 0.7636803 ]
 [0.82023188 0.7574785  0.66170195]
 [0.16001931 0.43654    0.05985354]]
--------------------
[[7 2]
 [9 2]]
--------------------
[[5 7]
 [3 9]
 [4 9]
 [7 4]
 [7 1]
 [9 8]
 [1 5]
 [3 8]
 [5 5]
 [1 5]]
--------------------
[[2 3]
 [5 1]
 [2 6]
 [2 2]
 [8 5]
 [4 6]
 [4 0]
 [9 5]
 [3 3]
 [3 5]]
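The calls above use the legacy `np.random` functions, whose results change on every run. As a hedged alternative, the following sketch uses NumPy's `Generator` interface with a seed (the seed value 42 is arbitrary) so that the draws can be reproduced.

```python
import numpy as np

rng = np.random.default_rng(seed=42)     # seed chosen arbitrarily for the example

uniform = rng.random((3, 3))                  # floats in [0, 1), like np.random.rand(3, 3)
integers = rng.integers(1, 10, size=(10, 2))  # ints in [1, 10), like np.random.randint

# Re-creating the generator with the same seed reproduces the same draws
rng_again = np.random.default_rng(seed=42)
same_uniform = rng_again.random((3, 3))
print(np.array_equal(uniform, same_uniform))
```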

Activities

Activity 1

Consider the following activity, in which we have to build this matrix:

array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])

tip: https://numpy.org/doc/stable/reference/generated/numpy.repeat.html

[11]:
# TODO ACTIVITY

Activity 2

Generate the following structure from array([1, 2, 3]):

array([1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3])

tip: https://numpy.org/doc/stable/reference/generated/numpy.tile.html
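As a hint for both activities, and without solving them, this small sketch contrasts `np.repeat` and `np.tile` on a different, made-up array.

```python
import numpy as np

base = np.array([7, 8])

repeated = np.repeat(base, 2)   # each element repeated in place: [7 7 8 8]
tiled = np.tile(base, 2)        # the whole array repeated:       [7 8 7 8]
grid = np.tile(base, (3, 1))    # three stacked copies, shape (3, 2)

print(repeated)
print(tiled)
print(grid)
```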

[12]:
# TODO Activity

Loading and dumping data

Input and output

How do we load data from a file, and how do we save results?

Always consider the format in which the data was stored. The format affects the performance of read/write operations and the amount of storage used.

  • CSV usually contains text; it is not efficient, but it is easy to ingest into other tools. Does a CSV file make sense for numeric matrices? https://datos.gob.es/es/catalogo

Loading data from Pandas into NumPy (non-ideal solution)

We will use any CSV file (e.g. tunnel clearance heights, "gálibos") to store those height values in a NumPy structure.

[13]:
!cat data/"GALIBOS TUNELES MADRID".csv
idelem;ALTURA MAXIMA;NOMBRE TUNEL;POINT_X;POINT_Y;SENTIDO;COMENTARIO- TIPO GALIBO;LATITUD;LONGITUD
8959;2,95;GLORIETA DE SAN VICENTE (ENTRADA PRINCIPAL) RAMPA DE ACCESO Y VIRGEN DEL PUERTO - PASEO DEL REY;438912,3031;4474612,979;Interior;2 CHAPAS con cadenas final acceso Paseo del Rey;40,41990315;-3,72007491
8954;4;BAILÉN - FERRAZ;439479,3243;4474181,2;E: Bailén;Estructura fija en interior;40,41605513;-3,713352936
8964;4;BAILÉN - FERRAZ;439392,055;4474958,045;E: Ferraz;Estructura fija en interior;40,42304694;-3,714455528
8961;4;BAILÉN - FERRAZ;439482,6262;4474804,58;E: Cuesta de San Vicente;Estructura fija en interior;40,42167106;-3,713373335
8960;4;BAILÉN - FERRAZ;439521,7846;4474874,166;E: Plaza España;Estructura fija en interior;40,42230076;-3,71291839
8962;4;BAILÉN - FERRAZ;439479,502;4474183,125;E: Bailén;CHAPAS con cadenas;40,41607248;-3,713351025
8957;3,9;AVENIDA DEL PLANETARIO BAJO PARQUE ENRIQUE TIERNO GALVÁN;441925,0158;4471575,968;E: Avda Planetario;8 Chapas con cadenas;40,39276008;-3,684290023
8934;3,5;MARÍA DE MOLINA - VELÁZQUEZ - A-2 BAJO LÓPEZ DE HOYOS;441625,8193;4476587,981;E: Po Castellana;BARRA con cadenas;40,4378899;-3,6882751
8935;3,5;MARÍA DE MOLINA - VELÁZQUEZ - A-2 BAJO LÓPEZ DE HOYOS;441618,1062;4476588,674;E: Po Castellana;Exterior (cadenas con chapas pintadas);40,4378956;-3,6883661
8956;3,5;MARÍA DE MOLINA - VELÁZQUEZ - A-2 BAJO LÓPEZ DE HOYOS;442066,6652;4476517,38;E: Velázquez;Exterior (cadenas con chapas pintadas);40,43728473;-3,683071113
8949;4,3;AVENIDA DE PÍO XII - MONFORTE DE LEMOS - SINESIO DELGADO BAJO PASEO DE LA CASTELLANA;442869,3053;4480480,113;E: M-30 (tubo norte);BARRA con cadenas, bajo PMV;40,4730384;-3,6739647
8950;4,5;AVENIDA DE PÍO XII - MONFORTE DE LEMOS - SINESIO DELGADO BAJO PASEO DE LA CASTELLANA;442652,2401;4480464,345;E: Avda Pio XII (tubo norte);BARRA con cadenas, bajo PMV;40,4728814;-3,6765238
8926;3;ALBERTO AGUILERA - SERRANO JOVER BAJO ALBERTO AGUILERA;439558,6536;4475760,808;E: Alberto Aguilera;CHAPAS con cadenas, bajo pórtico;40,4302907;-3,7125681
8927;3;SANTA CRUZ DE MARCENADO BAJO SERRANO JOVER;439300,8839;4475695,577;E: Princesa;CHAPAS con cadenas, bajo pórtico;40,4296843;-3,7156006
8924;2,95;GLORIETA DE SAN VICENTE (ENTRADA PRINCIPAL) RAMPA DE ACCESO Y VIRGEN DEL PUERTO - PASEO DEL REY;438889,0989;4474423,062;E: Virgen del Puerto;CHAPAS con cadenas;40,4181908;-3,7203326
8925;4;GLORIETA DE SAN VICENTE (ENTRADA PRINCIPAL) RAMPA DE ACCESO Y VIRGEN DEL PUERTO - PASEO DEL REY;439101,5316;4474659,531;E: Cta San Vicente;CHAPAS con cadenas;40,4203439;-3,7178517
8955;4,5;AVENIDA DE PÍO XII - MONFORTE DE LEMOS - SINESIO DELGADO BAJO PASEO DE LA CASTELLANA;442746,4367;4480399,381;E: Avda Pio XII (tubo sur) a M30;BARRA con cadenas, bajo PMV;40,47230268;-3,675406786
8936;3;AZCA;441464,0362;4477835,839;E: H Pinzón N-2;Estructura (y barra con cadenas, bajo frontis;40,4491198;-3,6902975
8937;4;AZCA;441463,4128;4477828,772;E: H Pinzón N-1;BARRA con cadenas en frontis;40,4490561;-3,6903042
8938;3;AZCA;441154,5001;4478216,599;E: G Perón C Haya N-2;Portico;40,452528;-3,6939828
8939;4;AZCA;441099,2385;4478264,333;E: G Perón C Haya N-1;Estructura (exterior) con tres placas colgando;40,4529541;-3,6946389
8940;3;AZCA;441158,5465;4478169,3;E: G Perón Entreplanta;Estructura con CHAPAS con cadenas;40,4521022;-3,6939307
8941;3;AZCA;441141,4101;4478105,872;E: G Perón Entreplanta;Estructura con CHAPAS con cadenas;40,4515296;-3,6941269
8942;4;AZCA;441257,2772;4478216,569;E: Orense N-1;Estructura (exterior) con dos placas colgando;40,452535;-3,6927708
8943;3;AZCA;441187,5315;4478215,751;E: Orense N-2;Estructura (exterior) justo antes del frontis;40,4525227;-3,6935932
8944;2,95;AZCA;440986,71;4477883,444;E: G Moscardó N-1;Barra en frontis;40,4495149;-3,6959305
8945;4;AZCA;441064,0107;4477518,421;E: Agustín Betancourt N-1;BARRA;40,4462321;-3,6949851
8946;4;AZCA;441057,0031;4477414,663;E: Agustín Betancourt N-1;BARRA con cadenas, bajo cartelón direccional y Pintado rayas en frontis;40,4452969;-3,6950581
8947;4;AZCA;441080,0131;4477573,299;E: Agustín Betancourt N-2;E: Desde nivel 1 a nivel 2;40,4467276;-3,6948015
8948;3;AZCA;441101,5086;4477639,757;E: Agustín Betancourt N-1;Chapa con cadenas;40,4473278;-3,6945542
8958;2,95;AZCA;441010,4375;4477877,086;E: G Moscardó N-2;BARRA fija en interior;40,44945931;-3,695650115
8965;2,85;AZCA;441111,5793;4477889,948;Salida Basilica;BARRA con cadenas;40,44958235;-3,694458649
8966;2,85;AZCA;441086,6144;4477890,715;Salida Basilica;Chapas con cadenas;40,44958749;-3,694753106
8967;3;AZCA;441204,42;4478036,222;Bajada al nivel 2;BARRA fija;40,45090662;-3,693377419
8970;3;AZCA;441165,3536;4478160,209;E: Orense al nivel 2;BARRA fija;40,45202079;-3,693849585
8971;3;AZCA;441185,137;4478159,361;E: Orense al nivel 2;BARRA fija;40,45201454;-3,693616213
8972;2,85;AZCA;441101,1363;4477889,271;Salida Basilica desde el nivel 2;BARRA fija;40,44957551;-3,69458173
8973;3;AZCA;441241,8723;4477650,129;S: Paseo de la Castellana;Portico;40,44743117;-3,692900051
8968;3;AZCA;441282,369;4477795,824;S: A nivel 1 Hermanos Pizon;Portico;40,44874652;-3,692436
8969;3;AZCA;441292,4601;4477805,21;E: Desde nivel 1 Hermanos Pizon;Portico;40,44883178;-3,692317876
8974;3;AZCA;441294,0581;4478019,345;Zona norte;Chapas con cadenas;40,45076092;-3,692318828
8975;3;AZCA;441261,9311;4478021,55;Zona norte;BARRA fija;40,45077851;-3,692697879
8980;3;AZCA;441212,5241;4478035,8;Bajada al nivel 2;Chapa con cadenas;40,45090339;-3,693281814
8977;2,85;AZCA;441061,9487;4477893,239;Salida Basilica;BARRA fija;40,44960848;-3,695044198
8978;4;AZCA;441323,6363;4477957,233;Salida sentido Maragall;BARRA fija;40,45020348;-3,691964297
8963;4;AZCA;441324,5433;4477967,923;Salida sentido Maragall;BARRA fija;40,45029984;-3,69195459
8976;3;AZCA;441288,5822;4477792,263;E: A zona privada;BARRA fija;40,44871488;-3,692362406
8981;2,4;AZCA;441093,0476;4477691,231;E: Carril en zona privada;BARRA fija;40,4477909;-3,694658743
8928;4;BAILÉN - FERRAZ;439537,71;4474885,986;E: Plaza España;CHAPAS con cadenas;40,4224084;-3,7127318
8929;4;BAILÉN - FERRAZ;439386,8731;4474964,254;E: Ferraz;CHAPAS con cadenas;40,4231025;-3,7145172
8930;4;BAILÉN - FERRAZ;439471,4819;4474795,195;E: Cuesta de San Vicente;CHAPAS con cadenas;40,4215857;-3,7135038
8932;4,3;SANTA MARÍA DE LA CABEZA BAJO GLORIETA DE SANTA MARÍA DE LA CABEZA;441090,1042;4473112,731;E: Sta M Cabeza;Individual, colgado en cadena y barra;40,4065454;-3,6942692
8933;3,5;MARÍA DE MOLINA - VELÁZQUEZ - A-2 BAJO LÓPEZ DE HOYOS;442067,8213;4476529,635;E: Velázquez;Exterior (Barra con con cadenas);40,4373952;-3,6830586
8931;2,7;ATOCHA - TOLEDO - MAYOR BAJO PLAZA MAYOR;440195,4334;4474017,199;E: Atocha. Individual, colgado de cadena;CHAPAS con cadenas, bajo cartelón;40,4146295;-3,7048974
8982;2,95;GLORIETA DE SAN VICENTE (ENTRADA PRINCIPAL) RAMPA DE ACCESO Y VIRGEN DEL PUERTO - PASEO DEL REY;438901,4438;4474616,198;Interior;CHAPAS con cadenas;40,41994508;-3,720215203
8983;2,95;GLORIETA DE SAN VICENTE (ENTRADA PRINCIPAL) RAMPA DE ACCESO Y VIRGEN DEL PUERTO - PASEO DEL REY;438950,9083;4474640,146;Interior;BARRA con cadenas;40,4201315;-3,719622509
8984;2,95;GLORIETA DE SAN VICENTE (ENTRADA PRINCIPAL) RAMPA DE ACCESO Y VIRGEN DEL PUERTO - PASEO DEL REY;438906,4081;4474619,97;Interior;CHAPAS con cadenas;40,41997369;-3,720152995
8985;2,95;GLORIETA DE SAN VICENTE (ENTRADA PRINCIPAL) RAMPA DE ACCESO Y VIRGEN DEL PUERTO - PASEO DEL REY;438897,2408;4474613,054;Interior;BARRA;40,41991808;-3,720266467
8986;1,95;BAILÉN - FERRAZ;439521,7776;4474784,004;E: Senado;Barra fija;40,42148854;-3,712909899
8979;2,4;AZCA;441092,3825;4477690,674;E: A zona privada;BARRA fija;40,44778584;-3,694666534

Recommendation: do not use spaces or accented characters in file names!

[14]:
import pandas as pd
df = pd.read_csv("data/GALIBOS TUNELES MADRID.csv",sep=";")
df.columns
height = df[df.columns[1]]
print(height)
## ACTIVITY: how do we fix this error?

data = np.array(height,np.float32)
print(data)
0     2,95
1        4
2        4
3        4
4        4
5        4
6      3,9
7      3,5
8      3,5
9      3,5
10     4,3
11     4,5
12       3
13       3
14    2,95
15       4
16     4,5
17       3
18       4
19       3
20       4
21       3
22       3
23       4
24       3
25    2,95
26       4
27       4
28       4
29       3
30    2,95
31    2,85
32    2,85
33       3
34       3
35       3
36    2,85
37       3
38       3
39       3
40       3
41       3
42       3
43    2,85
44       4
45       4
46       3
47     2,4
48       4
49       4
50       4
51     4,3
52     3,5
53     2,7
54    2,95
55    2,95
56    2,95
57    2,95
58    1,95
59     2,4
Name: ALTURA MAXIMA, dtype: object
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[14], line 8
      5 print(height)
      6 ## ACTIVIDAD: Como solucionaomos este error!!!!!!!!
----> 8 data = np.array(height,np.float32)
      9 print(data)

File ~/checkouts/readthedocs.org/user_builds/txadm/envs/latest/lib/python3.11/site-packages/pandas/core/series.py:1031, in Series.__array__(self, dtype, copy)
    981 """
    982 Return the values as a NumPy array.
    983
   (...)
   1028       dtype='datetime64[ns]')
   1029 """
   1030 values = self._values
-> 1031 arr = np.asarray(values, dtype=dtype)
   1032 if using_copy_on_write() and astype_is_view(values.dtype, arr.dtype):
   1033     arr = arr.view()

ValueError: could not convert string to float: '2,95'
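The conversion fails because the column holds strings with a decimal comma. One possible fix, sketched here with NumPy only, replaces the comma before converting; `height_text` is a made-up stand-in for the real column. With pandas, passing `decimal=","` to `read_csv` is another option.

```python
import numpy as np

# Stand-in for the "ALTURA MAXIMA" column, which pandas read as strings
height_text = np.array(["2,95", "4", "3,9"])

# Swap the decimal comma for a point, then convert to float32
height = np.char.replace(height_text, ",", ".").astype(np.float32)
print(height)
```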
[15]:
# https://numpy.org/doc/stable/reference/generated/numpy.loadtxt.html
data = np.loadtxt("data/GALIBOS TUNELES MADRID.csv", delimiter=";", usecols = (1), skiprows=1, converters={1: lambda s:float(str(s.decode()).replace(",","."))})
print(data)
print(data.shape)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[15], line 2, in <lambda>(s)
      1 # https://numpy.org/doc/stable/reference/generated/numpy.loadtxt.html
----> 2 data = np.loadtxt("data/GALIBOS TUNELES MADRID.csv", delimiter=";", usecols = (1), skiprows=1, converters={1: lambda s:float(str(s.decode()).replace(",","."))})
      3 print(data)

AttributeError: 'str' object has no attribute 'decode'

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
Cell In[15], line 2
      1 # https://numpy.org/doc/stable/reference/generated/numpy.loadtxt.html
----> 2 data = np.loadtxt("data/GALIBOS TUNELES MADRID.csv", delimiter=";", usecols = (1), skiprows=1, converters={1: lambda s:float(str(s.decode()).replace(",","."))})
      3 print(data)
      4 print(data.shape)

File ~/checkouts/readthedocs.org/user_builds/txadm/envs/latest/lib/python3.11/site-packages/numpy/lib/_npyio_impl.py:1397, in loadtxt(fname, dtype, comments, delimiter, converters, skiprows, usecols, unpack, ndmin, encoding, max_rows, quotechar, like)
   1394 if isinstance(delimiter, bytes):
   1395     delimiter = delimiter.decode('latin1')
-> 1397 arr = _read(fname, dtype=dtype, comment=comment, delimiter=delimiter,
   1398             converters=converters, skiplines=skiprows, usecols=usecols,
   1399             unpack=unpack, ndmin=ndmin, encoding=encoding,
   1400             max_rows=max_rows, quote=quotechar)
   1402 return arr

File ~/checkouts/readthedocs.org/user_builds/txadm/envs/latest/lib/python3.11/site-packages/numpy/lib/_npyio_impl.py:1036, in _read(fname, delimiter, comment, quote, imaginary_unit, usecols, skiplines, max_rows, converters, ndmin, unpack, dtype, encoding)
   1033     data = _preprocess_comments(data, comments, encoding)
   1035 if read_dtype_via_object_chunks is None:
-> 1036     arr = _load_from_filelike(
   1037         data, delimiter=delimiter, comment=comment, quote=quote,
   1038         imaginary_unit=imaginary_unit,
   1039         usecols=usecols, skiplines=skiplines, max_rows=max_rows,
   1040         converters=converters, dtype=dtype,
   1041         encoding=encoding, filelike=filelike,
   1042         byte_converters=byte_converters)
   1044 else:
   1045     # This branch reads the file into chunks of object arrays and then
   1046     # casts them to the desired actual dtype.  This ensures correct
   1047     # string-length and datetime-unit discovery (like `arr.astype()`).
   1048     # Due to chunking, certain error reports are less clear, currently.
   1049     if filelike:

ValueError: could not convert string '2,95' to float64 at row 0, column 2.
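The traceback shows that the converter received a `str`, so calling `.decode()` on it fails. The sketch below is one way to adapt the call, assuming a recent NumPy. It reads a three-row in-memory sample (shortened from the real file) and passes an explicit `encoding` so that the converter is handed `str` values; `comma_float` is a helper name introduced here for illustration.

```python
import io
import numpy as np

# Three rows in the same shape as the tunnel file (columns shortened for the example)
sample = io.StringIO(
    "idelem;ALTURA MAXIMA;NOMBRE TUNEL\n"
    "8959;2,95;GLORIETA DE SAN VICENTE\n"
    "8954;4;BAILEN - FERRAZ\n"
    "8957;3,9;AVENIDA DEL PLANETARIO\n"
)

def comma_float(text):
    """Parse a number written with a decimal comma, e.g. '2,95'."""
    return float(text.replace(",", "."))

heights = np.loadtxt(
    sample,
    delimiter=";",
    usecols=1,                    # only the height column is parsed
    skiprows=1,                   # skip the header line
    converters={1: comma_float},
    encoding="utf-8",             # explicit encoding so the converter receives str
)
print(heights)
print(heights.shape)
```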

Back to working with numeric data

[16]:
data = np.random.uniform(0.01,20.0,size=100000)
print(data[:5])
[13.60310172  7.21002243  9.17874302 14.35091613  7.49760525]
[17]:
f = open("data/tmp.npy","wb") # Writing file, in Binary mode
np.save(f,data)
f.close()

with open("data/tmp2.csv","w") as f2: # Writing but in txt
    for n in data:
        f2.write(str(n)+",")

[18]:
!ls -lih data/tmp*
816827 -rw-r--r-- 1 docs docs 782K Dec  2 09:09 data/tmp.npy
816828 -rw-r--r-- 1 docs docs 1.8M Dec  2 09:09 data/tmp2.csv
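The size difference in the listing can be estimated without touching the disk. This sketch, which writes to in-memory buffers rather than to the files above, compares the raw payload of the array with its `np.save` form and with a text form like the one in `tmp2.csv`.

```python
import io
import numpy as np

data = np.random.uniform(0.01, 20.0, size=100000)

# Raw payload: 100000 elements x 8 bytes per float64
print(data.nbytes)

# np.save adds only a small header on top of the raw bytes
buffer = io.BytesIO()
np.save(buffer, data)
binary_size = buffer.getbuffer().nbytes

# Text form: one decimal string per value plus a separator
text_size = sum(len(str(n)) + 1 for n in data)

print(binary_size, text_size)
```

The binary form stores every value in a fixed 8 bytes, while the text form spends one byte per printed digit, which is why the CSV is roughly twice as large.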
[19]:
data = np.random.uniform(0.01,20.0,size=100000)
data2 = np.random.normal(0.3,10,100000)
print(data[:5])
print(data2[:5])

f = open("data/tmp.npy","wb") # Writing file, in Binary mode
np.save(f,data)
np.save(f,data2)
f.close()

print("saved")


with open('data/tmp.npy', 'rb') as f:
    a = np.load(f)
    b = np.load(f)

print(a[:5])
print(b[:5])

[ 2.33963505 12.40730779 12.72202415  3.9764174   1.23036383]
[20.3324238   2.71139231  4.82868098 26.19804349 -0.36605367]
saved
[ 2.33963505 12.40730779 12.72202415  3.9764174   1.23036383]
[20.3324238   2.71139231  4.82868098 26.19804349 -0.36605367]
[20]:
import pickle

with open("data/tmp3.npy",'wb') as f:
    pickle.dump(a, f)
    pickle.dump(b, f)
[21]:
!ls -lih data/tmp*
816827 -rw-r--r-- 1 docs docs 1.6M Dec  2 09:09 data/tmp.npy
816828 -rw-r--r-- 1 docs docs 1.8M Dec  2 09:09 data/tmp2.csv
816829 -rw-r--r-- 1 docs docs 1.6M Dec  2 09:09 data/tmp3.npy
[22]:
f = "data/tmp3.npy"
ca = pickle.load(open(f,"rb"))
print(ca)
cb = pickle.load(open(f,"rb"))
print(cb)

# What happened here?
# How could we have saved both variables in the same file?
[ 2.33963505 12.40730779 12.72202415 ...  3.46810861  5.65687752
 16.96507584]
[ 2.33963505 12.40730779 12.72202415 ...  3.46810861  5.65687752
 16.96507584]
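Both prints show the same array because each `open()` call starts reading at the beginning of the file, so the first pickled object is loaded twice. A minimal sketch of two possible remedies follows, using in-memory streams and small made-up arrays: read consecutively through a single handle, or store named arrays with `np.savez`.

```python
import io
import pickle
import numpy as np

a = np.arange(3)
b = np.arange(3) * 10

# Two pickles written back to back into the same stream
stream = io.BytesIO()
pickle.dump(a, stream)
pickle.dump(b, stream)

# Reading through ONE handle advances past the first object
stream.seek(0)
first = pickle.load(stream)
second = pickle.load(stream)

# Alternative: np.savez stores several named arrays in one archive
archive = io.BytesIO()
np.savez(archive, a=a, b=b)
archive.seek(0)
loaded = np.load(archive)
print(first, second, loaded["a"], loaded["b"])
```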

Operations with NumPy arrays

[23]:
import numpy as np
a = np.array([.0,0.1])
b = np.array([1,1])
print(a+b)
print(a-b)
print(a/b)
print(a*b)
print(2**a)
[1.  1.1]
[-1.  -0.9]
[0.  0.1]
[0.  0.1]
[1.         1.07177346]
[24]:
c = np.array([1,1,1])
print(a+c) # Warning: shapes (2,) and (3,) cannot be broadcast together
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[24], line 2
      1 c = np.array([1,1,1])
----> 2 print(a+c) #Alerta

ValueError: operands could not be broadcast together with shapes (2,) (3,)
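The error comes from NumPy's broadcasting rule: shapes are compared from the last axis backwards, and two sizes are compatible only if they are equal or one of them is 1. The following sketch reproduces the failure and then shows one way to make the shapes compatible by adding a length-1 axis.

```python
import numpy as np

a = np.array([0.0, 0.1])     # shape (2,)
c = np.array([1, 1, 1])      # shape (3,)

# Sizes 2 and 3 differ and neither is 1, so the addition raises ValueError
try:
    a + c
    failed = False
except ValueError:
    failed = True

# Adding a length-1 axis makes the shapes compatible: (2, 1) with (3,) -> (2, 3)
outer_sum = a[:, np.newaxis] + c
print(failed)
print(outer_sum.shape)
```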
[25]:
c = np.array([1,1,1,1]).reshape(2,2)
print(a*c)
print("-"*10)
print(a.dot(c)) #https://numpy.org/doc/stable/reference/generated/numpy.dot.html

[[0.  0.1]
 [0.  0.1]]
----------
[0.1 0.1]
[26]:
# Tensor dot
#https://numpy.org/doc/stable/reference/generated/numpy.tensordot.html#numpy.tensordot
a = np.arange(60.).reshape(3,4,5)
b = np.arange(24.).reshape(4,3,2)
c = np.tensordot(a,b, axes=([1,0],[0,1]))
print(c)
print(c.shape)
[[4400. 4730.]
 [4532. 4874.]
 [4664. 5018.]
 [4796. 5162.]
 [4928. 5306.]]
(5, 2)
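The `axes` argument pairs axis 1 of `a` with axis 0 of `b`, and axis 0 of `a` with axis 1 of `b`; both pairs are summed over. As a cross-check, this sketch writes the same contraction with `np.einsum`, using index letters chosen here for illustration.

```python
import numpy as np

a = np.arange(60.).reshape(3, 4, 5)   # axes named i, j, k
b = np.arange(24.).reshape(4, 3, 2)   # axes named j, i, l

via_tensordot = np.tensordot(a, b, axes=([1, 0], [0, 1]))

# Same contraction: sum over i and j, keep k and l
via_einsum = np.einsum("ijk,jil->kl", a, b)

print(np.allclose(via_tensordot, via_einsum))
print(via_einsum.shape)
```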
[27]:
# Einsum
# https://numpy.org/doc/stable/reference/generated/numpy.einsum.html

a = np.arange(25).reshape(5,5)
np.einsum("ii",a)
[27]:
np.int64(60)
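The subscript string "ii" repeats one index and names no output, so it sums the diagonal (the trace). A few other common subscript patterns are sketched below for comparison with their usual NumPy equivalents.

```python
import numpy as np

a = np.arange(25).reshape(5, 5)

trace = np.einsum("ii", a)              # repeated index, no output: sum of the diagonal
diagonal = np.einsum("ii->i", a)        # keep the index: the diagonal itself
transposed = np.einsum("ij->ji", a)     # swap the output indices
product = np.einsum("ij,jk->ik", a, a)  # shared index j is summed: matrix product

print(trace, diagonal)
```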

https://en.wikipedia.org/wiki/Kronecker_product

\[a \otimes b\]
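For reference, the standard definition can be written as a block matrix. Assuming a is an m×n matrix with entries a_{ij}, each entry scales a full copy of b:

```latex
\[a \otimes b =
\begin{pmatrix}
a_{11}\, b & \cdots & a_{1n}\, b \\
\vdots & \ddots & \vdots \\
a_{m1}\, b & \cdots & a_{mn}\, b
\end{pmatrix}\]
```

If b has shape p×q, the result has shape mp×nq, which matches the 4×4 output obtained from two 2×2 inputs.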
[28]:
a = np.arange(1,5).reshape(2,2)
print(a)
b = np.array([0,5,6,7]).reshape(2,2)
print(b)
print("-"*10)
k = np.kron(a,b)
print(k)

[[1 2]
 [3 4]]
[[0 5]
 [6 7]]
----------
[[ 0  5  0 10]
 [ 6  7 12 14]
 [ 0 15  0 20]
 [18 21 24 28]]

Activity

Implement the Kronecker product using basic NumPy operations. Compare execution times between your version and the built-in one (np.kron).

[29]:
#TODO Activity

Functions on arrays

[30]:
a = np.array(range(10))
print(np.cos(a))
print(np.exp(a))
print(np.log(a))
[ 1.          0.54030231 -0.41614684 -0.9899925  -0.65364362  0.28366219
  0.96017029  0.75390225 -0.14550003 -0.91113026]
[1.00000000e+00 2.71828183e+00 7.38905610e+00 2.00855369e+01
 5.45981500e+01 1.48413159e+02 4.03428793e+02 1.09663316e+03
 2.98095799e+03 8.10308393e+03]
[      -inf 0.         0.69314718 1.09861229 1.38629436 1.60943791
 1.79175947 1.94591015 2.07944154 2.19722458]
/tmp/ipykernel_3676/928567411.py:4: RuntimeWarning: divide by zero encountered in log
  print(np.log(a))
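The warning appears because the first element is 0 and log(0) evaluates to -inf. Two possible ways to handle this are sketched below; which one fits depends on whether -inf is an acceptable result for the analysis.

```python
import numpy as np

a = np.arange(10)

# Option 1: accept -inf but silence the warning inside this block only
with np.errstate(divide="ignore"):
    with_inf = np.log(a)

# Option 2: compute the log only where it is defined, leaving NaN elsewhere
safe = np.full(a.shape, np.nan)
np.log(a, out=safe, where=a > 0)

print(with_inf[0], safe[0])
```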
[31]:
print(np.sum(a))
print(np.cumsum(a))
print(np.mean(a))

print("\n",np.cumprod(a))
print(np.min(a))
print(np.argmax(a))

45
[ 0  1  3  6 10 15 21 28 36 45]
4.5

 [0 0 0 0 0 0 0 0 0 0]
0
9
[32]:
print(a.mean())
print(a.min())
print(a.argmax())
4.5
0
9
[33]:
a = a.reshape(2,5)
print(a)
print("-"*10)
print(np.sum(a,axis=1))
print(np.sum(a,axis=0))
[[0 1 2 3 4]
 [5 6 7 8 9]]
----------
[10 35]
[ 5  7  9 11 13]

Activities

Activity 1

How do we compute the Euclidean distance between two vectors?

\[d(v_1,v_2)=\sqrt{\sum_{k=1}^n(x_{1,k}-x_{2,k})^2}\]
[34]:
v1 = np.arange(1,4)
v2 = np.arange(4,7)
#TODO Activity
# Solution == 5.196152422706632

Activity 2

And how do we compute the Manhattan distance?

\[d(v_1,v_2)= \sum_{k=1}^n \lvert x_{1,k} - x_{2,k} \rvert\]
[35]:
v1 = np.arange(1,4)
v2 = np.arange(4,7)
#TODO Activity
# Solution == 9

Reshaping an array

[36]:
a = np.arange(10)
print(a.shape)
print(a.reshape(2,5))
print(a)

(10,)
[[0 1 2 3 4]
 [5 6 7 8 9]]
[0 1 2 3 4 5 6 7 8 9]
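The last print shows that `reshape` does not modify `a` in place; it returns a new array object. For a contiguous array like this one, that object is a view on the same memory, as the following sketch illustrates. It also uses -1 to let NumPy infer one dimension.

```python
import numpy as np

a = np.arange(10)

b = a.reshape(2, -1)       # -1 lets NumPy infer the missing dimension (5 here)
print(b.shape)

# For a contiguous array like this one, reshape returns a view, not a copy
print(np.shares_memory(a, b))
b[0, 0] = 99               # writing through the view changes the original
print(a[0])
```

Call `.copy()` on the reshaped result when an independent array is needed.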
[37]:
a = a.reshape(2,5)
print(a.T)
print("-"*10)
print(np.hstack(a))  # https://numpy.org/doc/stable/reference/generated/numpy.hstack.html

[[0 5]
 [1 6]
 [2 7]
 [3 8]
 [4 9]]
----------
[0 1 2 3 4 5 6 7 8 9]
[38]:
b = np.arange(10,20).reshape(2,5)
print(b)
print("-"*10)
print(np.hstack((a,b))) # axis-1
print(np.vstack((a,b))) # axis-0

[[10 11 12 13 14]
 [15 16 17 18 19]]
----------
[[ 0  1  2  3  4 10 11 12 13 14]
 [ 5  6  7  8  9 15 16 17 18 19]]
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]
[39]:
c = np.dstack((a,b)) # axis-2  https://numpy.org/doc/stable/reference/generated/numpy.dstack.html
print(c)
print(c.shape)
[[[ 0 10]
  [ 1 11]
  [ 2 12]
  [ 3 13]
  [ 4 14]]

 [[ 5 15]
  [ 6 16]
  [ 7 17]
  [ 8 18]
  [ 9 19]]]
(2, 5, 2)
[40]:
print(a)
print(np.ravel(a)) # https://numpy.org/doc/stable/reference/generated/numpy.ravel.html

print(np.ravel(a,order="F")) # 'F' means column-major (Fortran-style) order
[[0 1 2 3 4]
 [5 6 7 8 9]]
[0 1 2 3 4 5 6 7 8 9]
[0 5 1 6 2 7 3 8 4 9]
[41]:
print(a)
print(np.split(a,2))
c1,c2 = np.split(a,2)

print("-"*10)

print(c1)
print(c1.shape)
print(np.ravel(c1))
print(c2)
[[0 1 2 3 4]
 [5 6 7 8 9]]
[array([[0, 1, 2, 3, 4]]), array([[5, 6, 7, 8, 9]])]
----------
[[0 1 2 3 4]]
(1, 5)
[0 1 2 3 4]
[[5 6 7 8 9]]
[42]:
print(np.concatenate((a,b)))
print("-"*10)
print(np.concatenate((a,b),axis=1))
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]
----------
[[ 0  1  2  3  4 10 11 12 13 14]
 [ 5  6  7  8  9 15 16 17 18 19]]
[43]:
# https://pillow.readthedocs.io/en/stable/
[44]:
%pip install pillow
Collecting pillow
  Downloading pillow-11.0.0-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (9.1 kB)
Downloading pillow-11.0.0-cp311-cp311-manylinux_2_28_x86_64.whl (4.4 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.4/4.4 MB 81.6 MB/s eta 0:00:00
Installing collected packages: pillow
Successfully installed pillow-11.0.0
Note: you may need to restart the kernel to use updated packages.
[45]:
from PIL import Image

image = Image.open('images/gatito.jpeg')
# summarize some details about the image
print(image.format)
print(image.size)
print(image.mode)

display(image)

JPEG
(1200, 800)
RGB
[Image output: the loaded picture]
[46]:
data = np.asarray(image)
print(len(data[0]))
print(data.size)
print(data.shape)
w,h,_ = data.shape  # careful: shape is (rows, cols, channels), so w holds the height (800) and h the width (1200)
1200
2880000
(800, 1200, 3)
[47]:
print(data[w//2,h//2])
data2 = data.copy()
data2[w//2,h//2] = np.array([255,0,0]) # a red point

img = Image.fromarray(data2, 'RGB')

display(img)

[153 143 131]
[Image output: the picture with one red pixel at its center]
[48]:
center_mask = w//2,h//2
rectangle_size = int(w*0.1)
mask = np.ones(rectangle_size*rectangle_size).reshape(rectangle_size,rectangle_size)
print(mask.shape)

for x in range(w//2,w//2+rectangle_size):
    for y in range(h//2,h//2+rectangle_size):
        data2[x,y] =  np.array([255,0,0])


img = Image.fromarray(data2, 'RGB')
display(img)

# Is the square REALLY in the center?

(80, 80)
[Image output: the picture with a red square drawn from the center point downwards and to the right]
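The loops start at the center point and extend from there, so the center of the image ends up at a corner of the square rather than in its middle. A possible centered variant is sketched below on a synthetic black image with the same shape as the photo (the names `canvas`, `top` and `left` are illustrative); a single slice assignment also replaces the two nested loops.

```python
import numpy as np

# Synthetic black image with the same shape as the photo: (rows, cols, channels)
height, width = 800, 1200
canvas = np.zeros((height, width, 3), dtype=np.uint8)

side = int(height * 0.1)              # 80-pixel square, as in the cell above
top = height // 2 - side // 2         # shift back by half a side to center it
left = width // 2 - side // 2

# One slice assignment replaces the two nested loops
canvas[top:top + side, left:left + side] = [255, 0, 0]

red_pixels = int((canvas[..., 0] == 255).sum())
print(top, left, red_pixels)
```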

Activity

Convert the image to grayscale. Using only NumPy!

[49]:
#WAY 1:
#
data = np.asarray(image)
data2 = data.copy()
for x in range(data2.shape[0]): # not efficient: Python-level loop over every pixel
     for y in range(data2.shape[1]):
         data2[x,y] = np.repeat(data[x,y].mean(),3)

img = Image.fromarray(data2, 'RGB')
display(img)
[Image output: the picture in grayscale]
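The per-pixel loop in WAY 1 can be expressed as a single reduction along the channel axis. This sketch applies the same unweighted mean to a small synthetic image, standing in for the photo, so that it runs without any image file.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic RGB image standing in for the photo (small to keep the example fast)
pixels = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)

# Average the three channels for every pixel at once
gray = pixels.mean(axis=2)

# Copy the gray level back into three channels to keep an RGB-shaped array
gray_rgb = np.repeat(gray[:, :, np.newaxis], 3, axis=2).astype(np.uint8)

print(gray.shape, gray_rgb.shape)
```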

Slicing operations

[50]:
a = np.arange(300).reshape(10,10,3)
print(a[:1])
print("-"*10)

print(a[0][0])
print("-"*10)

print(a[:,0])
print("-"*10)

print(a[:,2:4])
print("-"*10)
[[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]
  [ 9 10 11]
  [12 13 14]
  [15 16 17]
  [18 19 20]
  [21 22 23]
  [24 25 26]
  [27 28 29]]]
----------
[0 1 2]
----------
[[  0   1   2]
 [ 30  31  32]
 [ 60  61  62]
 [ 90  91  92]
 [120 121 122]
 [150 151 152]
 [180 181 182]
 [210 211 212]
 [240 241 242]
 [270 271 272]]
----------
[[[  6   7   8]
  [  9  10  11]]

 [[ 36  37  38]
  [ 39  40  41]]

 [[ 66  67  68]
  [ 69  70  71]]

 [[ 96  97  98]
  [ 99 100 101]]

 [[126 127 128]
  [129 130 131]]

 [[156 157 158]
  [159 160 161]]

 [[186 187 188]
  [189 190 191]]

 [[216 217 218]
  [219 220 221]]

 [[246 247 248]
  [249 250 251]]

 [[276 277 278]
  [279 280 281]]]
----------
[51]:
a = np.arange(300).reshape(10,10,3)
print(a[:1])
print("-"*10)

print(a[:,:,0])

print("-"*10)

r = 0.2126
print(a[:,:,0]*r)

[[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]
  [ 9 10 11]
  [12 13 14]
  [15 16 17]
  [18 19 20]
  [21 22 23]
  [24 25 26]
  [27 28 29]]]
----------
[[  0   3   6   9  12  15  18  21  24  27]
 [ 30  33  36  39  42  45  48  51  54  57]
 [ 60  63  66  69  72  75  78  81  84  87]
 [ 90  93  96  99 102 105 108 111 114 117]
 [120 123 126 129 132 135 138 141 144 147]
 [150 153 156 159 162 165 168 171 174 177]
 [180 183 186 189 192 195 198 201 204 207]
 [210 213 216 219 222 225 228 231 234 237]
 [240 243 246 249 252 255 258 261 264 267]
 [270 273 276 279 282 285 288 291 294 297]]
----------
[[ 0.      0.6378  1.2756  1.9134  2.5512  3.189   3.8268  4.4646  5.1024
   5.7402]
 [ 6.378   7.0158  7.6536  8.2914  8.9292  9.567  10.2048 10.8426 11.4804
  12.1182]
 [12.756  13.3938 14.0316 14.6694 15.3072 15.945  16.5828 17.2206 17.8584
  18.4962]
 [19.134  19.7718 20.4096 21.0474 21.6852 22.323  22.9608 23.5986 24.2364
  24.8742]
 [25.512  26.1498 26.7876 27.4254 28.0632 28.701  29.3388 29.9766 30.6144
  31.2522]
 [31.89   32.5278 33.1656 33.8034 34.4412 35.079  35.7168 36.3546 36.9924
  37.6302]
 [38.268  38.9058 39.5436 40.1814 40.8192 41.457  42.0948 42.7326 43.3704
  44.0082]
 [44.646  45.2838 45.9216 46.5594 47.1972 47.835  48.4728 49.1106 49.7484
  50.3862]
 [51.024  51.6618 52.2996 52.9374 53.5752 54.213  54.8508 55.4886 56.1264
  56.7642]
 [57.402  58.0398 58.6776 59.3154 59.9532 60.591  61.2288 61.8666 62.5044
  63.1422]]
[52]:
# WAY2
# https://e2eml.school/convert_rgb_to_grayscale.html


# TODO

print(data.shape)
img = Image.fromarray(np.uint8(data))
display(img)


(800, 1200, 3)
[Image output: the picture rebuilt from the unmodified array]
[53]:
# Way 3

rgbcorrection = np.array([0.2989, 0.5870, 0.1140])

data = np.asarray(image)
print(data.shape)
data2 = np.dot(data,rgbcorrection)

print(data2.shape)
img = Image.fromarray(np.uint8(data2))
display(img)


# as a plot
import matplotlib.pyplot as plt
plt.imshow(data2, cmap = plt.get_cmap(name = 'gray'))
plt.show()
(800, 1200, 3)
(800, 1200)
[Image output: the picture in grayscale, computed with weighted channels]
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[53], line 15
     11 display(img)
     14 # as a plot
---> 15 import matplotlib.pyplot as plt
     16 plt.imshow(data2, cmap = plt.get_cmap(name = 'gray'))
     17 plt.show()

ModuleNotFoundError: No module named 'matplotlib'
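
The weighted sum used in Way 3 can be checked on a tiny synthetic image, without needing an image file. The sketch below assumes an RGB array of shape (height, width, 3) and reuses the same luminance weights; the `pixels` array is made up purely for illustration.

```python
import numpy as np

# Hypothetical 2x2 RGB image: pure red, pure green, pure blue and white
pixels = np.array([[[255, 0, 0], [0, 255, 0]],
                   [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

# Same luminance weights as in the cell above
rgbcorrection = np.array([0.2989, 0.5870, 0.1140])

# np.dot contracts the last axis of `pixels` (the 3 channels) with the weights,
# so a (2, 2, 3) array becomes a (2, 2) array of gray levels
gray = np.dot(pixels, rgbcorrection)

# Equivalent formulation: explicit weighted sum over the channel axis
gray_alt = (pixels * rgbcorrection).sum(axis=-1)

print(gray.shape)
print(np.allclose(gray, gray_alt))
```

Both formulations rely on the channel axis being the last one, which is the layout `np.asarray` produces for an RGB PIL image.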

Custom vectorized functions

[54]:
temperatura = np.random.randint(-10,43,1000)
[55]:
temperatura[temperatura>33]
[55]:
array([36, 42, 34, 42, 35, 34, 41, 35, 42, 35, 36, 38, 39, 40, 39, 40, 38,
       37, 37, 41, 41, 40, 40, 39, 37, 39, 40, 39, 34, 36, 35, 42, 37, 37,
       39, 34, 36, 34, 41, 40, 37, 41, 35, 41, 38, 38, 36, 36, 41, 35, 42,
       38, 34, 38, 38, 37, 37, 37, 41, 42, 41, 37, 42, 42, 38, 39, 35, 36,
       35, 40, 36, 34, 35, 41, 40, 40, 35, 34, 40, 42, 36, 38, 34, 38, 38,
       41, 35, 36, 40, 40, 35, 36, 42, 38, 36, 40, 38, 38, 40, 37, 40, 39,
       37, 35, 42, 36, 37, 34, 35, 36, 41, 41, 41, 35, 35, 38, 35, 36, 35,
       34, 35, 40, 39, 41, 35, 39, 36, 41, 37, 40, 39, 37, 35, 34, 35, 40,
       40, 37, 39, 37, 38, 36, 40, 36, 42, 39, 41, 42, 34, 36, 37, 35, 39,
       36, 40, 38, 34, 34, 41, 42, 34, 38, 34, 35, 36, 35, 35, 36, 40, 34])
[56]:
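# Version 1: returns True when the condition holds, but None (not False) otherwise
def isHot(grados):
    if grados>33 and grados<=40:
        return True

# Version 2 (replaces version 1): always returns a boolean
def isHot(grados):
    return grados>33 and grados<=40

# np.vectorize wraps the scalar function so it can be applied to a whole array
fHot = np.vectorize(isHot)
print(fHot(temperatura))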
[False False  True False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False  True False False False
 False False False False False False False False False False False  True
 False False False False False  True False False False False False False
 False False False False False False False False False False False  True
 False False False False False False False False False False False  True
 False False False  True  True False False False False False False False
 False False False False False False  True False False False  True False
 False False False False False False False False False False False False
 False False  True  True False False False False False False False  True
 False  True False False False False  True False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False  True
 False False  True False False  True False False  True False False False
 False  True False False False False False  True False  True False  True
 False False  True False False False False False False False  True False
 False False False False False False False False False False False False
 False False  True False False False False  True  True False False False
 False False False False False False False  True False  True False False
 False False False False False  True False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False  True  True False False
 False False False False False False False False False  True False False
 False False False False False False False False False False False False
 False  True False False False False  True False False False False  True
 False False False False False False False False  True False False False
 False False False False False False False  True False False False False
 False False False False False  True False False False  True False False
 False False  True False False False  True False False False False False
  True False  True False False False False False False False False False
 False False False False False False  True False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False  True False False False
 False False False False False False False False False False False False
 False False  True False False  True False  True False False False False
 False False False  True False False  True False False False False  True
  True False False  True False False False False  True False False False
 False False  True False False False False False False  True False False
  True False False False False False False  True False False False False
 False  True False False False False False False  True False False  True
 False False False False False False False False False False  True False
 False False  True False  True False False False False False False False
 False  True False False  True False  True False False  True False False
 False False False False False False False False False False False False
 False False  True False False  True False False False False False False
  True  True False False False  True  True False False False  True False
  True  True False  True  True  True False  True False False False False
 False False False False  True False False False False False False False
 False False False False False False False False False False False  True
 False False False False False False False  True False False False  True
  True False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False  True False False False False
 False  True  True  True False  True False False False False False  True
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False  True False False
  True False False False False False False False False False False  True
  True False False False False False False  True False False False False
 False False False False False  True False False  True False False  True
 False False  True False False False False  True False False  True False
 False False False False  True  True False False False False False False
 False  True False False False  True False False False  True False  True
 False False False False False False False False False False False  True
 False False False  True False False False  True False False  True False
 False False False False  True False False False False False False False
 False False False  True False False False False False  True False False
 False False False False False False False False  True False  True False
 False False False False False False False False False False False False
 False False False False False  True False False False False False False
 False False False False  True False  True False False  True False  True
 False False False False False  True False False  True False False False
  True False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False  True  True False False
 False False False  True False False False  True False False False False
 False False False False  True  True False False False False False False
  True False False False False False False False False  True False  True
 False  True False False]
[57]:
temperatura[fHot(temperatura)]
[57]:
array([36, 34, 35, 34, 35, 35, 36, 38, 39, 40, 39, 40, 38, 37, 37, 40, 40,
       39, 37, 39, 40, 39, 34, 36, 35, 37, 37, 39, 34, 36, 34, 40, 37, 35,
       38, 38, 36, 36, 35, 38, 34, 38, 38, 37, 37, 37, 37, 38, 39, 35, 36,
       35, 40, 36, 34, 35, 40, 40, 35, 34, 40, 36, 38, 34, 38, 38, 35, 36,
       40, 40, 35, 36, 38, 36, 40, 38, 38, 40, 37, 40, 39, 37, 35, 36, 37,
       34, 35, 36, 35, 35, 38, 35, 36, 35, 34, 35, 40, 39, 35, 39, 36, 37,
       40, 39, 37, 35, 34, 35, 40, 40, 37, 39, 37, 38, 36, 40, 36, 39, 34,
       36, 37, 35, 39, 36, 40, 38, 34, 34, 34, 38, 34, 35, 36, 35, 35, 36,
       40, 34])
[58]:
# Alternative without np.vectorize (note that the lower bound here is 30, not 33)
# https://numpy.org/doc/stable/reference/routines.logic.html
np.logical_and(temperatura>30,temperatura<=40)
[58]:
array([False, False,  True, False, False, False, False, False, False,
       False, False,  True, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False,  True,
       False, False, False, False, False,  True, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False,  True, False, False, False, False,  True,  True,
       False, False, False, False, False,  True, False, False, False,
       False, False, False, False,  True, False, False, False,  True,
       False, False, False, False, False, False, False, False, False,
       False, False,  True, False, False, False,  True,  True, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False,  True, False,  True, False,  True, False,
       False,  True, False, False,  True, False, False, False, False,
       False, False, False, False, False,  True,  True, False, False,
       False, False, False, False, False,  True, False,  True, False,
       False,  True, False,  True, False, False, False, False, False,
       False, False, False, False, False, False, False,  True, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False,  True, False, False,
       False, False, False, False, False, False, False, False,  True,
       False, False, False, False, False, False,  True, False, False,
       False,  True,  True, False, False,  True, False, False,  True,
       False, False,  True, False, False, False, False,  True, False,
       False, False, False, False,  True, False,  True, False,  True,
       False, False,  True, False, False, False, False, False, False,
       False,  True, False, False, False, False, False, False, False,
       False, False, False,  True, False, False, False, False,  True,
       False, False, False, False,  True,  True, False, False, False,
       False, False, False, False, False, False, False,  True, False,
        True, False, False, False, False, False, False,  True,  True,
       False, False, False, False, False,  True,  True, False, False,
       False, False, False, False, False, False, False,  True, False,
       False, False, False,  True,  True, False, False, False,  True,
        True, False, False, False, False, False, False, False, False,
       False, False, False,  True, False, False, False,  True,  True,
       False, False, False, False, False, False, False, False, False,
       False,  True, False, False, False, False,  True, False, False,
       False, False,  True, False, False, False, False, False, False,
       False, False,  True,  True, False, False, False, False, False,
       False, False, False, False,  True, False,  True, False,  True,
        True, False, False, False, False,  True, False, False, False,
        True, False, False, False,  True,  True, False, False, False,
        True, False, False, False, False, False,  True, False,  True,
       False, False, False, False,  True,  True, False, False, False,
       False, False, False, False,  True,  True,  True, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False,  True, False,
       False, False, False, False, False, False,  True, False, False,
       False,  True, False, False, False, False,  True, False,  True,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False,  True,  True,
       False, False,  True, False,  True, False, False, False, False,
       False, False, False,  True, False, False,  True, False, False,
       False, False,  True,  True, False, False,  True, False, False,
       False, False,  True, False, False, False, False, False,  True,
       False, False, False, False, False, False,  True, False,  True,
        True, False, False, False, False, False, False,  True, False,
       False, False, False, False,  True, False, False, False, False,
       False, False,  True, False, False,  True, False, False, False,
       False, False,  True, False, False, False, False,  True, False,
       False, False,  True, False,  True, False, False, False, False,
       False, False, False, False,  True, False,  True,  True, False,
        True, False,  True,  True,  True,  True, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False,  True, False, False,  True, False, False, False,
       False, False, False,  True,  True, False, False, False,  True,
        True, False, False, False,  True,  True,  True,  True, False,
        True,  True,  True, False,  True, False, False, False, False,
       False, False, False, False,  True, False, False, False, False,
       False, False, False,  True, False, False, False, False, False,
       False, False, False, False, False,  True, False,  True, False,
       False, False, False, False,  True, False,  True, False,  True,
        True, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False,  True, False, False, False, False,  True, False,
       False, False, False, False,  True,  True,  True, False,  True,
       False, False, False, False,  True,  True, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False,  True, False, False,  True, False, False,
       False, False, False, False, False, False, False, False,  True,
        True, False, False, False, False, False, False,  True, False,
       False,  True, False, False, False,  True, False, False,  True,
       False,  True,  True, False, False,  True, False, False,  True,
        True, False, False, False,  True, False, False,  True, False,
       False, False, False, False,  True,  True, False, False, False,
       False, False, False, False,  True, False, False, False,  True,
       False, False, False,  True, False,  True, False, False, False,
       False, False, False, False,  True, False, False, False,  True,
       False, False, False,  True, False, False, False,  True, False,
       False,  True, False, False, False, False, False,  True, False,
        True, False, False, False, False, False, False, False, False,
        True, False, False, False, False, False,  True, False, False,
       False, False, False, False, False, False, False, False,  True,
       False,  True, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False,  True, False, False, False, False, False, False,
       False, False, False, False,  True, False,  True, False, False,
        True, False,  True, False, False,  True, False, False,  True,
       False, False,  True, False, False, False,  True, False, False,
        True, False, False,  True, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False,  True, False, False,  True, False, False, False,  True,
       False, False,  True,  True, False, False, False, False, False,
        True, False, False, False,  True, False, False, False, False,
       False, False, False, False,  True,  True, False, False, False,
       False, False, False,  True, False, False, False, False, False,
        True, False, False,  True,  True,  True, False,  True, False,
       False])
[59]:
isHot = lambda x: (x>30 and x<=40)
index = list(map(isHot,temperatura))
temperatura[index]
[59]:
array([36, 31, 32, 34, 35, 33, 34, 31, 33, 35, 35, 36, 38, 39, 33, 40, 32,
       31, 39, 40, 38, 37, 31, 37, 31, 31, 31, 32, 32, 40, 40, 39, 37, 39,
       40, 39, 34, 36, 35, 31, 37, 37, 39, 34, 36, 33, 34, 33, 32, 32, 32,
       32, 40, 37, 35, 32, 32, 38, 38, 36, 36, 32, 35, 32, 33, 31, 38, 34,
       33, 38, 38, 37, 37, 33, 32, 33, 32, 37, 31, 33, 31, 32, 37, 33, 38,
       39, 35, 36, 35, 40, 36, 34, 35, 40, 40, 32, 35, 34, 40, 36, 38, 31,
       34, 38, 38, 35, 32, 36, 40, 31, 40, 32, 32, 35, 36, 38, 36, 40, 38,
       38, 33, 40, 37, 40, 39, 37, 35, 36, 31, 37, 32, 34, 32, 35, 36, 33,
       35, 35, 38, 35, 36, 32, 35, 34, 35, 40, 39, 35, 31, 33, 39, 33, 36,
       37, 40, 32, 39, 37, 35, 34, 35, 40, 40, 37, 31, 39, 37, 38, 36, 40,
       33, 36, 39, 34, 36, 37, 35, 39, 36, 40, 33, 38, 34, 34, 33, 33, 31,
       33, 31, 34, 38, 34, 35, 36, 35, 35, 31, 36, 33, 40, 34])
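
The three ways of building the mask shown above (np.vectorize, np.logical_and and a plain Python function mapped over the array) give the same result when the thresholds match. A minimal sketch, assuming a freshly generated `temperatura` array with an arbitrary seed, that also shows the `&` operator as a fourth option:

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only to make the sketch repeatable
temperatura = np.random.randint(-10, 43, 1000)

def isHot(grados):
    return grados > 33 and grados <= 40

# 1) np.vectorize: convenient, but it still calls the Python function per element
mask_vectorize = np.vectorize(isHot)(temperatura)

# 2) np.logical_and applied to two element-wise comparisons
mask_logical = np.logical_and(temperatura > 33, temperatura <= 40)

# 3) the & operator; the parentheses are required because & binds tighter than >
mask_operator = (temperatura > 33) & (temperatura <= 40)

print(np.array_equal(mask_vectorize, mask_logical))
print(np.array_equal(mask_logical, mask_operator))
print(temperatura[mask_operator].min(), temperatura[mask_operator].max())
```

Options 2 and 3 operate on whole arrays at once, so they are usually the better choice for large inputs.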

More logical functions

[60]:
np.random.seed(2022)
a = np.random.randint(-90,0,100)

index = np.where(a<-80) # Note: np.where returns indices, not values
print(index)
print(a[index])
(array([21, 50, 66, 71, 78, 91, 97]),)
[-88 -88 -81 -83 -86 -86 -88]
[61]:
print(np.logical_or(a<-80,a<-90))

[False False False False False False False False False False False False
 False False False False False False False False False  True False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False  True False False False False False False False False False
 False False False False False False  True False False False False  True
 False False False False False False  True False False False False False
 False False False False False False False  True False False False False
 False  True False False]
[62]:
print(np.logical_not(np.logical_or(a<-40,a<-90)))
[False False  True  True False False False  True False False False False
 False  True False False False  True False False False False  True False
 False  True  True  True False  True  True False False  True False False
  True  True False  True  True False  True False False False  True  True
  True False False  True False False  True  True  True  True  True False
 False False False False  True False False  True False False False False
  True  True False  True False  True False  True False  True False  True
  True False False  True  True  True False False False False  True  True
  True False False  True]
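
np.where has two forms that are easy to confuse. The sketch below, using a made-up `signal` array, contrasts the one-argument form (indices), a plain boolean mask (values) and the three-argument form (element-wise choice).

```python
import numpy as np

signal = np.array([-85, -40, -91, -10, -60])  # made-up readings

# One-argument form: a tuple of index arrays, one per dimension
idx = np.where(signal < -80)
print(idx)
print(signal[idx])

# Boolean mask: selects the same elements without going through indices
print(signal[signal < -80])

# Three-argument form: where the condition holds take -80, otherwise keep the value
floored = np.where(signal < -80, -80, signal)
print(floored)
```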

Activities

How can we obtain this transformation?

From

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

to

array([ 0, -1,  2, -1,  4, -1,  6, -1,  8, -1])
[ ]:

Operations on groups

[63]:
np.random.seed(2022)
a = np.random.randint(-30,45,100)
print(a)
[ 15  19  25 -12  -6 -14  23  11   3  -3 -19 -11  18 -11   8  42 -16 -14
 -19 -28 -15  -7   7  26 -18  15 -16 -17  22  12  23  34  15  33  -7   7
  -9  31  26  16 -28  34  16  11  31  31  42  34  21   3  -1  -7  17  19
  25  -2 -21  40  17   1  -4 -23  35  12  32   2 -26 -20  -5  23  32  -7
  -6  34  30   5 -26 -11  19  20  29 -28  -3  41  24 -27  -9  37 -19  15
  37  14  26   3   5  -5  25 -24 -29  -4]
[64]:
9 in a
[64]:
False
[65]:
if -14 in a and -6 not in a:
    print("Something strange")
elif 45 in a:
    print("45 is present")
else:
    print("Both -14 and -6 are present, and 45 is not")
Both -14 and -6 are present, and 45 is not
[66]:
#https://numpy.org/doc/stable/reference/generated/numpy.unique.html?highlight=unique#numpy.unique

unique_a = np.unique(a)  # returns the unique values, sorted
print(unique_a)
[-29 -28 -27 -26 -24 -23 -21 -20 -19 -18 -17 -16 -15 -14 -12 -11  -9  -7
  -6  -5  -4  -3  -2  -1   1   2   3   5   7   8  11  12  14  15  16  17
  18  19  20  21  22  23  24  25  26  29  30  31  32  33  34  35  37  40
  41  42]
[67]:
unique_a, freq_a = np.unique(a,return_counts=True)
print(a)
print(len(a))
print("-"*10)
print(freq_a)
print(len(freq_a))
print("-"*10)
print(unique_a)
print(len(unique_a))

# How many repeated elements are there?
[ 15  19  25 -12  -6 -14  23  11   3  -3 -19 -11  18 -11   8  42 -16 -14
 -19 -28 -15  -7   7  26 -18  15 -16 -17  22  12  23  34  15  33  -7   7
  -9  31  26  16 -28  34  16  11  31  31  42  34  21   3  -1  -7  17  19
  25  -2 -21  40  17   1  -4 -23  35  12  32   2 -26 -20  -5  23  32  -7
  -6  34  30   5 -26 -11  19  20  29 -28  -3  41  24 -27  -9  37 -19  15
  37  14  26   3   5  -5  25 -24 -29  -4]
100
----------
[1 3 1 2 1 1 1 1 3 1 1 2 1 2 1 3 2 4 2 2 2 2 1 1 1 1 3 2 2 1 2 2 1 4 2 2 1
 3 1 1 1 3 1 3 3 1 1 3 2 1 4 1 2 1 1 2]
56
----------
[-29 -28 -27 -26 -24 -23 -21 -20 -19 -18 -17 -16 -15 -14 -12 -11  -9  -7
  -6  -5  -4  -3  -2  -1   1   2   3   5   7   8  11  12  14  15  16  17
  18  19  20  21  22  23  24  25  26  29  30  31  32  33  34  35  37  40
  41  42]
56
[68]:
unique_a,index_a,freq_a = np.unique(a,return_counts=True,return_index=True)
print(freq_a)
print(index_a)
print("-"*10)
print(np.where(freq_a==4))
print(unique_a[17]) # appears four times
print(unique_a[33]) # appears four times
print(unique_a[50]) # appears four times

print(a[np.where(a==-7)])
[1 3 1 2 1 1 1 1 3 1 1 2 1 2 1 3 2 4 2 2 2 2 1 1 1 1 3 2 2 1 2 2 1 4 2 2 1
 3 1 1 1 3 1 3 3 1 1 3 2 1 4 1 2 1 1 2]
[98 19 85 66 97 61 56 67 10 24 27 16 20  5  3 11 36 21  4 68 60  9 55 50
 59 65  8 75 22 14  7 29 91  0 39 52 12  1 79 48 28  6 84  2 23 80 74 37
 64 33 31 62 87 57 83 15]
----------
(array([17, 33, 50]),)
-7
15
34
[-7 -7 -7 -7]
[69]:
np.sort(freq_a)
[69]:
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4])
[70]:

print(np.array([0,4,1,2,5,7,9])[::-1])
print(np.argsort(np.array([0,4,1,2,5,7,9])))
print(np.argsort(np.array([0,4,1,2,5,7,9]))[::-1])
print("-"*10)
index_sorted = np.argsort(freq_a)[::-1]
#https://numpy.org/doc/stable/reference/generated/numpy.argsort.html
print(index_sorted)

[9 7 5 2 1 4 0]
[0 2 3 1 4 5 6]
[6 5 4 1 3 2 0]
----------
[50 33 17 43 44 37 47 41 15  8 26  1 52 48 34 18 31 28 27 35  3 21 16 55
 13 19 20 30 11 39 38 36 53 54 51 49 45 46 42 40 32 22 25 24 29 23 12 14
  9 10  7  6  4  5  2  0]
[71]:
unique_a[index_sorted]
[71]:
array([ 34,  15,  -7,  25,  26,  19,  31,  23, -11, -19,   3, -28,  37,
        32,  16,  -6,  12,   7,   5,  17, -26,  -3,  -9,  42, -14,  -5,
        -4,  11, -16,  21,  20,  18,  40,  41,  35,  33,  29,  30,  24,
        22,  14,  -2,   2,   1,   8,  -1, -15, -12, -18, -17, -20, -21,
       -24, -23, -27, -29])
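
Combining np.unique with np.argmax and np.argsort gives the most frequent value and a frequency ranking. A small sketch on a made-up `values` array; the stable sort on the negated counts is one possible way to keep ties in ascending order of value.

```python
import numpy as np

values = np.array([3, 1, 3, 2, 1, 3, 5])  # made-up data

unique_v, freq_v = np.unique(values, return_counts=True)

# Most frequent value: position of the largest count
most_common = unique_v[np.argmax(freq_v)]

# Values ordered from most to least frequent; a stable sort on the negated
# counts keeps tied values in ascending order
order = np.argsort(-freq_v, kind="stable")
ranking = unique_v[order]

print(most_common)
print(ranking)
```

Reversing an ascending argsort with `[::-1]`, as in the cells above, also works, but it reverses the order of tied values as well.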

Activities

Activity 1

  • What is the most frequent color in the kitten image?

  • Replace those pixels with blue: rgb=(0,0,255)

[72]:
from PIL import Image

image = Image.open('images/gatito.jpeg')
#TODO

[73]:
a = np.arange(10)
b = np.arange(5,15)
print(a)
print(b)
print("-"*10)
print(np.isin(a,b))  # np.in1d is deprecated; np.isin is its replacement
print(np.intersect1d(a,b))
print("-"*10)
print(np.setdiff1d(a,b))
print(np.setdiff1d(b,a))
print("-"*10)
print(np.union1d(a,b))
[0 1 2 3 4 5 6 7 8 9]
[ 5  6  7  8  9 10 11 12 13 14]
----------
[False False False False False  True  True  True  True  True]
[5 6 7 8 9]
----------
[0 1 2 3 4]
[10 11 12 13 14]
----------
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]
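
The membership test can also be used as a mask to filter an array. A minimal sketch with the same `a` and `b`, using np.isin and its `invert` parameter; for arrays without repeated values, as here, the results coincide with np.intersect1d and np.setdiff1d.

```python
import numpy as np

a = np.arange(10)
b = np.arange(5, 15)

# True where an element of a also appears in b
mask = np.isin(a, b)
common = a[mask]

# invert=True flips the test: elements of a that do NOT appear in b
only_in_a = a[np.isin(a, b, invert=True)]

print(common)
print(only_in_a)
```

Unlike the set functions, the mask keeps the original order and any duplicates of `a`.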

Activity 2

How can we find peak values, that is, values greater than both of their neighbors? For the first array below, the expected result is the second one (the indices of the peaks):

array([0, 1, 2, 3, 4, 54, 6, 7, 80, 9])
array([5, 8])

A couple of hints:

  • np.diff https://numpy.org/doc/stable/reference/generated/numpy.diff.html?highlight=diff#numpy.diff

  • np.sign https://numpy.org/doc/stable/reference/generated/numpy.sign.html?highlight=sign#numpy.sign

[74]:
a = np.array([0, 1, 2, 3, 4, 54, 6, 7, 80, 9])
# TODO

Activity 3

Is there any row or column with only a single unknown (represented here by 0)?

sudoku = np.array([[5,3,0,0,7,0,0,0,0],
                 [6,0,0,1,9,5,0,0,0],
                 [1,9,8,0,0,0,0,6,0],
                 [8,0,0,0,6,0,0,0,3],
                 [4,0,0,8,0,3,0,0,1],
                 [7,0,0,0,2,0,0,0,6],
                 [0,6,0,0,0,0,2,8,0],
                 [3,8,0,4,1,9,7,2,5],
                 [4,0,0,0,8,0,0,7,9]])

Statistical functions

[75]:
np.random.seed(2022)
temperatures= np.random.normal(loc=17,scale=20,size=1000000)


# https://numpy.org/doc/stable/reference/routines.statistics.html

print(temperatures.mean())

17.001888200806263
[76]:
import matplotlib.pyplot as plot

fig, ax = plot.subplots()
ax.plot(np.sort(temperatures))
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[76], line 1
----> 1 import matplotlib.pyplot as plot
      3 fig, ax = plot.subplots()
      4 ax.plot(np.sort(temperatures))

ModuleNotFoundError: No module named 'matplotlib'
[77]:
np.quantile(temperatures,0.5)
[77]:
np.float64(17.030287339736915)
[78]:
# Note: `a` is still the array defined in Activity 2, not `temperatures`
np.percentile(a,90) #https://numpy.org/doc/stable/reference/generated/numpy.percentile.html#numpy.percentile
[78]:
np.float64(56.599999999999994)
[79]:
# https://numpy.org/doc/stable/reference/generated/numpy.histogram.html#numpy.histogram
hist, bin_edges = np.histogram(temperatures)
print(hist)
print(bin_edges)

print(np.sum(hist))

[     4    425   9761  86584 290984 379850 192169  37354   2789     80]
[-90.26009195 -69.94754344 -49.63499494 -29.32244643  -9.00989792
  11.30265058  31.61519909  51.9277476   72.2402961   92.55284461
 112.86539312]
1000000
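
np.histogram also accepts explicit bin edges and a `density` flag. The sketch below uses a smaller sample and hand-picked edges (both are assumptions made for illustration) to show that there is always one more edge than bins and that the density version has unit area.

```python
import numpy as np

np.random.seed(2022)
sample = np.random.normal(loc=17, scale=20, size=10_000)  # smaller sample for the sketch

# Explicit bin edges instead of the default 10 equal-width bins
edges = np.arange(-100, 141, 20)  # -100, -80, ..., 140
hist, bin_edges = np.histogram(sample, bins=edges)

# With density=True the bar areas (height * width) add up to 1
density, _ = np.histogram(sample, bins=edges, density=True)
area = np.sum(density * np.diff(bin_edges))

print(len(hist), len(bin_edges))
print(area)
```

Values outside the given edges are not counted, so with explicit edges `np.sum(hist)` can be smaller than the sample size.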
[80]:
import matplotlib.pyplot as plt
_ = plt.hist(temperatures, bins='auto')  # arguments are passed to np.histogram
plt.title("Histogram with 'auto' bins")
plt.show()
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[80], line 1
----> 1 import matplotlib.pyplot as plt
      2 _ = plt.hist(temperatures, bins='auto')  # arguments are passed to np.histogram
      3 plt.title("Histogram with 'auto' bins")

ModuleNotFoundError: No module named 'matplotlib'
[81]:
# https://numpy.org/doc/stable/reference/generated/numpy.linspace.html

space = np.linspace(0,1,len(temperatures))
print(space[:10])
[0.000000e+00 1.000001e-06 2.000002e-06 3.000003e-06 4.000004e-06
 5.000005e-06 6.000006e-06 7.000007e-06 8.000008e-06 9.000009e-06]
[82]:
import matplotlib.pylab as plt

temperatures_sorted = np.sort(temperatures)
fig, ax = plt.subplots()
ax.plot(temperatures_sorted,space)
#CDF?
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[82], line 1
----> 1 import matplotlib.pylab as plt
      3 temperatures_sorted = np.sort(temperatures)
      4 fig, ax = plt.subplots()

ModuleNotFoundError: No module named 'matplotlib'
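
Plotting the sorted values against an evenly spaced grid from 0 to 1 draws the empirical cumulative distribution function (CDF). The same idea can be checked numerically without matplotlib; `ecdf` below is a hypothetical helper written for this sketch, and the sample is smaller than the one above.

```python
import numpy as np

np.random.seed(2022)
sample = np.random.normal(loc=17, scale=20, size=100_000)
sample_sorted = np.sort(sample)

def ecdf(x):
    """Fraction of the sample that is <= x (hypothetical helper)."""
    return np.searchsorted(sample_sorted, x, side="right") / sample_sorted.size

# 17 is the mean (and median) of the generating distribution, so this is close to 0.5
print(ecdf(17))

# Evaluating the ECDF at the 0.9 quantile gives a value close to 0.9
print(ecdf(np.quantile(sample, 0.9)))
```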

Handling warnings and errors

[83]:
a , b = 0, 3
c = b/a
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
Cell In[83], line 2
      1 a , b = 0, 3
----> 2 c = b/a

ZeroDivisionError: division by zero
[84]:
a , b = 0, 3
try:
    c = b/a
except:  # a bare except catches every exception; the next cells name the expected one
    print("error")
finally:
    print("Trying an alternative line of execution")

print("Execution continues")
error
Trying an alternative line of execution
Execution continues
[85]:
a , b = 0, 3
try:
    c = b/a
except ZeroDivisionError:
    print("error")
finally:
    print("Trying an alternative line of execution")
error
Trying an alternative line of execution
[86]:
a , b = 0, 3
d={"a":0,"b":-1}
try:
    print(d["c"])
except ZeroDivisionError:
    print("error")
finally:
    print("Trying an alternative line of execution")
Trying an alternative line of execution
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[86], line 4
      2 d={"a":0,"b":-1}
      3 try:
----> 4     print(d["c"])
      5 except ZeroDivisionError:
      6     print("error")

KeyError: 'c'
[87]:
a , b = 0, 3
d={"a":0,"b":-1}
try:
    print(d["c"])
except ZeroDivisionError:
    print("There is a zero")
except KeyError:
    print("Key does not exist")
finally:
    print("Trying an alternative line of execution")
Key does not exist
Trying an alternative line of execution

Docs : https://docs.python.org/3/tutorial/errors.html
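
The try statement has two more pieces not used in the cells above: `as`, which binds the exception object to a name, and `else`, which runs only when nothing was raised. A minimal sketch; `safe_divide` is a hypothetical helper, not part of any library.

```python
def safe_divide(b, a):
    """Hypothetical helper: returns b/a, or None when a is zero."""
    try:
        result = b / a
    except ZeroDivisionError as exc:
        # `as exc` gives access to the exception object and its message
        print("Could not divide:", exc)
        return None
    else:
        # runs only when the try block raised nothing
        return result
    finally:
        # runs in every case, even after a return statement
        print("done")

print(safe_divide(3, 0))
print(safe_divide(3, 2))
```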

[88]:
a , b = np.arange(10), np.arange(10,20)

c = b/a
print(c)
[        inf 11.          6.          4.33333333  3.5         3.
  2.66666667  2.42857143  2.25        2.11111111]
/tmp/ipykernel_3676/475367663.py:3: RuntimeWarning: divide by zero encountered in divide
  c = b/a
[89]:
import math
print(math.inf in c)
print(c * np.random.rand(10))
True
[       inf 0.42534124 1.24828272 3.89676199 0.25580228 1.26203671
 1.21949344 0.98978683 1.31024886 0.09050309]
[90]:
try:
    c = b/a
except RuntimeWarning:
    print("Did I catch the warning?") # No: by default a warning is reported, not raised
/tmp/ipykernel_3676/2254159111.py:2: RuntimeWarning: divide by zero encountered in divide
  c = b/a
[91]:
# Warnings can be ignored
import warnings
warnings.filterwarnings("ignore")
c = b/a
print(c)
[        inf 11.          6.          4.33333333  3.5         3.
  2.66666667  2.42857143  2.25        2.11111111]
[92]:
# Warnings can also be handled as exceptions
np.seterr(all='raise')
c = b/a
print(c)
---------------------------------------------------------------------------
FloatingPointError                        Traceback (most recent call last)
Cell In[92], line 3
      1 # Warnings can also be handled as exceptions
      2 np.seterr(all='raise')
----> 3 c = b/a
      4 print(c)

FloatingPointError: divide by zero encountered in divide
[93]:
try:
    c = b/a
except FloatingPointError:
    print("Did I catch the warning?") # Yes
Did I catch the warning?
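
np.seterr changes the behavior globally for the rest of the session. When the change should apply to a single block only, the np.errstate context manager is one option. A minimal sketch with the same `a` and `b`:

```python
import numpy as np

a, b = np.arange(10), np.arange(10, 20)

# Inside the block a division by zero raises FloatingPointError;
# on exit the previous error settings are restored automatically
try:
    with np.errstate(divide="raise"):
        c = b / a
except FloatingPointError as exc:
    print("Caught:", exc)
    caught = True
else:
    caught = False

# Alternative: silence the warning locally and clean up the result afterwards
with np.errstate(divide="ignore"):
    c = b / a
c_clean = np.where(np.isinf(c), np.nan, c)
print(c_clean[:3])
```

Replacing `inf` with `nan` is only one possible clean-up policy; which value makes sense depends on what the result is used for.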

Reflections on computational and algorithmic performance

Main metrics perceived by the user:

  • Response time, service time and waiting time

Main metrics for the system:

  • Throughput (jobs/time)

These metrics depend on the demand placed on the service. Higher demand -> ???

Performance is influenced by:

  • Hardware: technology, architecture

  • Software: operating systems, programming language, applications

  • The way you program!

[94]:
import time

start = time.time()
# do something
print("Response time: %s seconds"%(time.time()-start))
Response time: 3.457069396972656e-05 seconds
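
A single measurement with time.time() can be noisy, especially for very short operations. One common refinement, sketched here under the assumption that repeating the call is acceptable, is to use time.perf_counter and keep the best of several runs; `best_time` is a hypothetical helper written for this example.

```python
import time
import numpy as np

def best_time(func, *args, repeats=5):
    """Hypothetical helper: smallest wall-clock time of `repeats` runs, in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution clock
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best

serie = np.random.random(100_000)
elapsed = best_time(np.sqrt, serie)
print("np.sqrt: %s seconds" % elapsed)
```

The standard library module timeit follows the same idea and is an alternative when measuring small snippets.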
[95]:
import math
import numpy as np
import time
serie = np.random.random(10000000)

start = time.time()
b = []
for value in serie:
    try:
        b.append(math.sqrt(value))
    except:
        b.append(0)
end1 = time.time()-start
print("Response time: %s seconds"%(end1))
Response time: 1.705350637435913 seconds
[96]:
start = time.time()
b = np.sqrt(serie)
end2 = time.time()-start
print("Response time: %s seconds"%(end2))
Response time: 0.13524341583251953 seconds
[97]:
speedup = end1/end2
print(speedup)
print("Program 2 is %0.2f times faster than program 1"%speedup)
12.609491019864187
Program 2 is 12.61 times faster than program 1

If program 1 needs 1 hour to run, program 2 needs only about 4.76 minutes (60 / 12.61). If program 1 needs 24 hours, program 2 needs only about 1.90 hours (24 / 12.61).

Caution: performance metrics tend to follow an exponential distribution. They are NOT linear!

[98]:
times1 = []
for size in range(10,1000000,1000):
    serie = np.random.random(size)
    start = time.time()
    b = []
    for value in serie:
        try:
            b.append(math.sqrt(value))
        except:
            b.append(0)
    end1 = time.time()-start
    times1.append(end1)

print(".")
times2 = []
for size in range(10,1000000,1000):
    serie = np.random.random(size)
    start = time.time()
    b = np.sqrt(serie)
    end2 = time.time()-start
    times2.append(end2)

.
[99]:
import matplotlib.pyplot as plt
x = list(range(len(times1)))
fig, ax = plt.subplots()
ax.plot(x, times1, label = "program 1")
ax.plot(x, times2, label = "program 2")
ax.legend()
plt.show()

# Not complex enough (or it is, depending on the architecture) to produce the curve
ModuleNotFoundError: No module named 'matplotlib'

The Numba library

https://numba.pydata.org/

[100]:
!pip install numba
Collecting numba
  Downloading numba-0.60.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (2.7 kB)
Collecting llvmlite<0.44,>=0.43.0dev0 (from numba)
  Downloading llvmlite-0.43.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.8 kB)
Collecting numpy<2.1,>=1.22 (from numba)
  Downloading numpy-2.0.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (60 kB)
Downloading numba-0.60.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (3.7 MB)
Downloading llvmlite-0.43.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (43.9 MB)
Downloading numpy-2.0.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.5 MB)
Installing collected packages: numpy, llvmlite, numba
  Attempting uninstall: numpy
    Found existing installation: numpy 2.1.3
    Uninstalling numpy-2.1.3:
      Successfully uninstalled numpy-2.1.3
Successfully installed llvmlite-0.43.0 numba-0.60.0 numpy-2.0.2
[101]:
import multiprocessing

multiprocessing.cpu_count()
[101]:
2
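
`multiprocessing.cpu_count()` reports the CPUs present in the machine, which can be more than the ones the current process is allowed to use (for example inside a container). A small sketch of a more conservative count, assuming a fallback to the machine-wide number is acceptable where the affinity call is not available; `usable_cpus` is an illustrative name:

```python
import os

def usable_cpus():
    """CPUs this process may run on; falls back to the machine-wide count."""
    if hasattr(os, "sched_getaffinity"):  # available on Linux, not on macOS/Windows
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1

print(usable_cpus())
```
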
[102]:
from numba import njit
import random


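def monte_carlo_pi_sinParalelizar(nsamples):
    # Pure-Python baseline: estimate pi from the fraction of random points
    # in the unit square that fall inside the quarter circle of radius 1.
    acc = 0
    for i in range(nsamples):
        x = random.random()
        y = random.random()
        if (x ** 2 + y ** 2) < 1.0:
            acc += 1
    return 4.0 * acc / nsamples

@njit  # same algorithm, compiled to machine code by Numba on the first call
def monte_carlo_pi(nsamples):
    acc = 0
    for i in range(nsamples):
        x = random.random()
        y = random.random()
        if (x ** 2 + y ** 2) < 1.0:
            acc += 1
    return 4.0 * acc / nsamples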
ImportError: Numba needs NumPy 2.0 or less. Got NumPy 2.1.
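
A likely cause of this error (an assumption based on the pip log above) is that pip replaced NumPy 2.1.3 with 2.0.2 on disk while the running kernel had already imported 2.1.3, so Numba still sees the old version until the kernel is restarted. The sketch below compares the version loaded in memory with the one installed on disk; `loaded_vs_installed` is a hypothetical helper, not part of Numba or NumPy:

```python
import sys
from importlib import metadata

def loaded_vs_installed(package):
    """Return (version loaded in this process, version installed on disk)."""
    module = sys.modules.get(package)  # None if the package was never imported
    loaded = getattr(module, "__version__", None)
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        installed = None
    return loaded, installed

loaded, installed = loaded_vs_installed("numpy")
if loaded is not None and loaded != installed:
    print("Restart the kernel: numpy %s is loaded but %s is installed"
          % (loaded, installed))
```

If the two versions differ, restarting the kernel and re-running the imports is usually enough.
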
[103]:
%timeit monte_carlo_pi_sinParalelizar(100)
NameError: name 'monte_carlo_pi_sinParalelizar' is not defined
[104]:
%timeit monte_carlo_pi(100)
NameError: name 'monte_carlo_pi' is not defined

### Poor-performance scenario …

How to fetch data from a database the “wrong” way
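
The notebook does not show the scenario itself. As one possible illustration (an assumption, not the author's example), a common way to fetch data badly is to issue one query per row instead of a single query for the whole set. The table name `readings` and its contents are made up for this sketch, which uses an in-memory SQLite database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
conn.executemany("INSERT INTO readings (value) VALUES (?)",
                 [(float(i),) for i in range(1000)])

# Slow pattern: fetch the ids first, then run one query per row.
ids = [row[0] for row in conn.execute("SELECT id FROM readings ORDER BY id")]
slow = [conn.execute("SELECT value FROM readings WHERE id = ?", (i,)).fetchone()[0]
        for i in ids]

# Faster pattern: a single query returns every row at once.
fast = [row[0] for row in conn.execute("SELECT value FROM readings ORDER BY id")]

print(slow == fast)
conn.close()
```

Both lists hold the same values; the difference is the number of round trips, which matters far more against a remote database than against an in-memory one.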

Final Activity

Given these three data catalogs:

  • A) https://datos.gob.es/es/catalogo/ea0010587-balears-illes-por-municipios-y-fenomeno-demografico-mnpd-identificador-api-t20-e301-fenom-a2020-l0-23007-px

  • B) https://data.cityofnewyork.us/Housing-Development/Speculation-Watch-List/adax-9mit

  • C) https://ec.europa.eu/eurostat/databrowser/view/gov_10a_exp/default/table?lang=en

You must analyze or apply at least 4 indicators of your choice; that is, choose statistical or other indicators according to the nature of the data. As an example, here is the first one:

  • With A): series of live births by the mother's place of residence, ordered series of deaths by place of residence, and mean and standard deviation of the five indicators available per municipality

  • With B): ?

  • With C): ?

REQUIREMENTS

  • You must analyze the data EXCLUSIVELY with the NumPy library.

  • Avoiding hard-coded values will be valued positively; that is, the code should be robust and as generic as possible.

  • Including plots will be valued positively.
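
As a starting point for the "no hard-coded values" requirement, the sketch below reads a header row with `np.genfromtxt` and computes the mean and standard deviation of every numeric column it finds. The CSV content is hypothetical, made up for illustration; the real catalogs have different columns and may need a different delimiter or encoding:

```python
import io
import numpy as np

# Hypothetical CSV content, made up for illustration; real catalogs differ.
csv_text = """municipality,births,deaths
A,10,4
B,20,6
C,30,8
"""

# names=True reads the header row, so column names are never typed by hand.
table = np.genfromtxt(io.StringIO(csv_text), delimiter=",", names=True,
                      dtype=None, encoding="utf-8")

# Keep only the columns whose inferred type is numeric.
numeric = [name for name in table.dtype.names
           if np.issubdtype(table.dtype[name], np.number)]
for name in numeric:
    print(name, table[name].mean(), table[name].std())
```

Because the numeric columns are discovered from the data, the same loop works unchanged if a catalog adds or renames indicators.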

Submission:

  • Upload the link to your personal GitHub repository to the corresponding Task in Auladigital.

  • Create one notebook per data catalog.

  • Include a README.md file containing the report (indicators used, plus the observations and conclusions drawn from each one).

  • The following directory structure is recommended: /part2/numpy_activity/notebookA….