1. Basics of AI/ML
What is AI?
- Artificial Intelligence (AI): The field of building machines that perform tasks associated with human intelligence, such as learning, reasoning, and problem-solving.
- Machine Learning (ML): A subset of AI that uses algorithms to learn patterns from data and make predictions or decisions.
Types of ML:
- Supervised Learning: Learn from labeled data (e.g., predicting house prices).
- Unsupervised Learning: Discover patterns in unlabeled data (e.g., customer segmentation).
- Reinforcement Learning: Learn by interacting with an environment to maximize a reward (e.g., game-playing AI).
- Semi-Supervised Learning: Combines small amounts of labeled data with large amounts of unlabeled data.
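Semi-supervised learning is the one type above without an example elsewhere in these notes, so here is a minimal sketch. It assumes scikit-learn's `LabelPropagation` with its default kernel; the data values are made up for illustration, and `-1` is scikit-learn's marker for an unlabeled sample.

```python
import numpy as np
from sklearn.semi_supervised import LabelPropagation

# Illustrative data: two well-separated 1-D groups.
# Only one sample per group is labeled; -1 marks an unlabeled sample.
X = np.array([[1.0], [1.2], [0.8], [1.1], [9.0], [9.2], [8.8], [9.1]])
y = np.array([0, -1, -1, -1, 1, -1, -1, -1])

model = LabelPropagation()
model.fit(X, y)

# transduction_ holds the label inferred for every training sample
inferred_labels = model.transduction_
print(inferred_labels)
```

The two labeled points spread their labels to nearby unlabeled points, so each group ends up with a single label.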
Common Algorithms:
- Supervised: Linear Regression, Decision Trees, Support Vector Machines (SVM), Neural Networks.
- Unsupervised: K-Means Clustering, Principal Component Analysis (PCA).
- Reinforcement: Q-Learning, Deep Q-Networks (DQN).
2. Examples of AI/ML in Action
Supervised Learning (Regression Example):
- Predict house prices based on features like size and location.
```python
from sklearn.linear_model import LinearRegression

# Training data
X = [[1200], [1500], [2000]]  # House sizes (sq. ft.)
y = [200000, 250000, 320000]  # Prices

# Train the model
model = LinearRegression()
model.fit(X, y)

# Predict price for a 1800 sq. ft. house
predicted_price = model.predict([[1800]])
print(f"Predicted Price: ${predicted_price[0]:,.2f}")
```
Unsupervised Learning (Clustering Example):
- Segment customers based on purchase behavior.
```python
from sklearn.cluster import KMeans

# Customer data: [Visits, Spend]
data = [[5, 500], [10, 1000], [15, 1500], [30, 3000]]

# Cluster into 2 groups (fixed n_init and seed for reproducible results)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(data)
print("Cluster Centers:", kmeans.cluster_centers_)
```
Reinforcement Learning Example:
- Train an AI agent to balance a cartpole in OpenAI Gym.
```python
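import gym

# Assumes the Gym >= 0.26 API: reset() returns (observation, info),
# step() returns a 5-tuple, and rendering is configured in make().
env = gym.make('CartPole-v1', render_mode='human')
state, info = env.reset()
for _ in range(1000):
    action = env.action_space.sample()  # Random action
    next_state, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        break
env.close()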
```
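The CartPole loop above only samples random actions, so nothing is learned. As a hedged illustration of learning from reward, here is a minimal tabular Q-learning sketch. The corridor environment, the `step` helper, and the hyperparameter values are all hypothetical and chosen for illustration; they are not part of Gym.

```python
import random

# Hypothetical toy environment: a corridor of 5 cells (states 0..4).
# The agent starts in cell 0 and receives reward 1 on reaching cell 4.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]  # index 0 = move left, index 1 = move right

def step(state, action_idx):
    """Apply an action; return (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action_idx], 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

alpha, gamma, epsilon = 0.5, 0.9, 0.2  # illustrative hyperparameters
Q = [[0.0, 0.0] for _ in range(N_STATES)]
rng = random.Random(0)

for episode in range(200):
    state, done = 0, False
    while not done:
        # Epsilon-greedy: explore sometimes (or on ties), otherwise exploit
        if rng.random() < epsilon or Q[state][0] == Q[state][1]:
            action = rng.randrange(2)
        else:
            action = 0 if Q[state][0] > Q[state][1] else 1
        next_state, reward, done = step(state, action)
        # Q-learning update: move Q(s, a) toward reward + discounted best next value
        target = reward + (0.0 if done else gamma * max(Q[next_state]))
        Q[state][action] += alpha * (target - Q[state][action])
        state = next_state

# Greedy action for each non-goal state (1 = move right)
greedy_policy = [0 if q[0] > q[1] else 1 for q in Q[:GOAL]]
print(greedy_policy)
```

After training, the greedy policy moves right in every cell, which is the shortest path to the reward.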
3. Mathematical Foundations of ML
Key Formulas:
- Linear Regression:
  \[
  y = wX + b
  \]
  - \( w \): Weight, \( b \): Bias, \( X \): Input.
- Logistic Regression (Classification):
  \[
  P(y=1 \mid X) = \frac{1}{1 + e^{-(wX + b)}}
  \]
- Gradient Descent:
  - Update rule for minimizing the loss function:
  \[
  w = w - \eta \frac{\partial L}{\partial w}
  \]
  - \( \eta \): Learning rate, \( L \): Loss function.
- Support Vector Machine (SVM):
  - Decision boundary:
  \[
  f(X) = w^T X + b
  \]
  - Maximize margin between classes.
- Neural Networks:
  - Forward pass for a single neuron:
  \[
  y = \sigma(wX + b)
  \]
  - \( \sigma \): Activation function (e.g., ReLU, Sigmoid).
- K-Means Clustering:
  - Update cluster centroids iteratively:
  \[
  \mu_k = \frac{1}{|N_k|} \sum_{i \in N_k} x_i
  \]
  - \( \mu_k \): Centroid of cluster \( k \), \( N_k \): Set of indices of the points assigned to cluster \( k \).
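To connect the linear regression and gradient descent formulas above, here is a minimal NumPy sketch that fits \( w \) and \( b \) by gradient descent. The notes do not name a loss function, so mean squared error is an assumption here, as are the synthetic data, the learning rate, and the step count.

```python
import numpy as np

# Synthetic data generated from y = 3x + 2 (values chosen for illustration)
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 3.0 * X + 2.0

w, b = 0.0, 0.0
eta = 0.05  # learning rate (assumed value)

for _ in range(2000):
    error = (w * X + b) - y            # prediction minus target
    grad_w = 2.0 * np.mean(error * X)  # dL/dw for L = mean(error^2)
    grad_b = 2.0 * np.mean(error)      # dL/db for L = mean(error^2)
    w -= eta * grad_w                  # w = w - eta * dL/dw
    b -= eta * grad_b                  # same update rule applied to the bias

print(f"w = {w:.3f}, b = {b:.3f}")
```

Because the data are noise-free, the fitted values approach the generating weight and bias. A learning rate that is too large would make the updates diverge instead.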
4. Specific Situations for AI/ML Applications
Scenario 1: Predicting Customer Churn
- Use classification algorithms to predict if a customer will leave based on features like usage, complaints, or subscription age.
```python
from sklearn.ensemble import RandomForestClassifier

# Features: [Monthly Charges, Tenure]
X = [[70, 12], [120, 36], [50, 3], [90, 24]]
y = [0, 0, 1, 0]  # 0 = No churn, 1 = Churn

clf = RandomForestClassifier(random_state=0)
clf.fit(X, y)

prediction = clf.predict([[80, 18]])  # New customer
print("Will the customer churn?", "Yes" if prediction[0] else "No")
```
Scenario 2: Fraud Detection
- Use anomaly detection to spot fraudulent transactions.
```python
from sklearn.ensemble import IsolationForest

# Transactions: [Amount, Frequency]
data = [[100, 1], [200, 2], [5000, 10], [100, 1]]

clf = IsolationForest(contamination=0.25, random_state=0)
clf.fit(data)

outliers = clf.predict(data)  # -1 marks an anomaly (possible fraud)
print("Fraudulent Transactions:", [row for row, flag in zip(data, outliers) if flag == -1])
```
Scenario 3: Image Recognition
- Identify objects in images using Convolutional Neural Networks (CNNs).
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Small CNN for 64x64 RGB images and 10 output classes
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])
model.summary()  # Display architecture
```
Scenario 4: Natural Language Processing (NLP)
- Perform sentiment analysis on product reviews.
```python
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
result = classifier("This product is amazing!")
print(result)
```
Scenario 5: Self-Driving Cars
- Use Reinforcement Learning to train a car to navigate traffic. Simulators like CARLA provide environments for training and testing without real-world risk.
5. Best Practices for AI/ML
- Data Preprocessing:
  - Clean, normalize, and scale data.
```python
from sklearn.preprocessing import StandardScaler
# data: 2-D array of numeric features (rows = samples)
scaler = StandardScaler()
scaled_data = scaler.fit_transform(data)
```
- Model Evaluation:
  - Use metrics like accuracy, precision, recall, and F1-score.
```python
from sklearn.metrics import classification_report
# y_true: ground-truth labels, y_pred: model predictions on held-out data
print(classification_report(y_true, y_pred))
```
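- Avoid Overfitting:
  - Use techniques like cross-validation, dropout (in neural networks), and regularization. At a minimum, hold out a test set so that performance is measured on data the model has not seen.
```python
from sklearn.model_selection import train_test_split
# Hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
```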
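- Hyperparameter Tuning:
  - Optimize parameters using Grid Search or Random Search.
```python
from sklearn.model_selection import GridSearchCV
# estimator: any scikit-learn estimator
# param_grid: dict mapping parameter names to lists of candidate values
grid = GridSearchCV(estimator, param_grid, scoring='accuracy')
grid.fit(X, y)
```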
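The snippets in this section leave `data`, `X`, `y`, `estimator`, and `param_grid` undefined. As one possible way to combine the four practices end to end, the sketch below uses assumptions that are not in the notes: the Iris dataset, an `SVC` estimator, and a small illustrative parameter grid.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Example dataset (an assumption; any labeled dataset would do)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so it is re-fit on each CV training fold
pipeline = make_pipeline(StandardScaler(), SVC())

# Illustrative candidates; "svc__C" targets the C parameter of the SVC step
param_grid = {"svc__C": [0.1, 1, 10], "svc__kernel": ["linear", "rbf"]}

# 5-fold cross-validated grid search over the training data only
grid = GridSearchCV(pipeline, param_grid, scoring="accuracy", cv=5)
grid.fit(X_train, y_train)

print("Best parameters:", grid.best_params_)

# Final evaluation on the held-out test set
y_pred = grid.predict(X_test)
print(classification_report(y_test, y_pred))
test_accuracy = grid.score(X_test, y_test)
```

Putting the scaler in the pipeline keeps test-fold statistics out of training, which is a common source of overly optimistic cross-validation scores.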
6. Resources to Learn AI/ML
- Books: "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow" by Aurélien Géron.
- Courses: Andrew Ng’s ML Course, Fast.ai.
- Tools: Scikit-learn, TensorFlow, PyTorch, Hugging Face.