Mamdani Learning Quick Start Guide¶

Overview¶

MamdaniLearning is a class for learning optimal fuzzy rules in Mamdani systems using metaheuristic optimization. Unlike Wang-Mendel, which extracts rules from data in a single pass, MamdaniLearning iteratively searches for the best rule combinations using metaheuristics such as simulated annealing and evolutionary algorithms.

Key Features:
- Optimizes rule consequents (outputs) for existing membership functions
- Supports multiple metaheuristic algorithms (SA, GA, PSO, DE)
- Pre-computes rule activations for efficiency
- Works with any pre-configured Mamdani system
- No gradient computation required

Algorithm Overview¶

What MamdaniLearning Does¶

Takes a Mamdani system with defined membership functions for inputs and outputs
Generates all possible rules from the Cartesian product of input MFs
Optimizes which output MF each rule should use as its consequent
Uses metaheuristics to search the discrete space of possible rule combinations
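
The rule-generation step can be pictured with a small standalone sketch. It does not call the library; the term names mirror the fan-speed example in the setup section, and the variable names (`input_terms`, `antecedents`) are illustrative assumptions rather than library attributes.

```python
from itertools import product

# Hypothetical term names per input variable (illustration only).
input_terms = {
    "temperature": ["cold", "warm", "hot"],
    "humidity": ["dry", "normal", "wet"],
}
output_terms = ["low", "medium", "high"]

# One antecedent per element of the Cartesian product of input terms.
antecedents = list(product(*input_terms.values()))
n_rules = len(antecedents)

# Each rule independently picks one output term, so the discrete
# search space holds len(output_terms) ** n_rules candidate rule bases.
search_space_size = len(output_terms) ** n_rules

print(n_rules)            # 9
print(search_space_size)  # 19683
```

Even this small system has 19683 candidate rule bases, which is why the search is left to a metaheuristic instead of manual tuning.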

What It Optimizes¶

Input: Pre-defined membership functions (triangular, trapezoidal, etc.)
Output: Assignment of output terms to rules (e.g., Rule 1 → "high", Rule 2 → "low")
Objective: Minimize prediction error (RMSE) on training data
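
For reference, the RMSE objective can be written in a few lines of NumPy. This is a generic sketch of the metric, not the library's internal cost function; the helper name `rmse` is an assumption made for illustration.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

perfect = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])   # no error at all
off_by_two = rmse([0.0, 0.0], [2.0, -2.0])         # every prediction misses by 2
```

Because the errors are squared before averaging, a few large misses raise the cost more than many small ones.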

Difference from Other Methods¶

Method	What it learns	Algorithm	Speed
Wang-Mendel	Selects rules from data	One-pass extraction	Very fast
MamdaniLearning	Optimizes rule consequents	Metaheuristics	Moderate
ANFIS	Adjusts MF parameters + consequents	Gradient descent + LSE	Slow

1. Setup: Create and Configure Mamdani System¶

Before using MamdaniLearning, you must create a Mamdani system with all membership functions defined.

from fuzzy_systems.inference import MamdaniSystem
from fuzzy_systems.learning import MamdaniLearning
import numpy as np

# Create Mamdani system
system = MamdaniSystem()

# Add input variables
system.add_input('temperature', (0, 40))
system.add_input('humidity', (0, 100))

# Add output variable
system.add_output('fan_speed', (0, 100))

# Define INPUT membership functions
# Temperature: cold, warm, hot
system.add_term('temperature', 'cold', 'trapezoidal', (0, 0, 10, 20))
system.add_term('temperature', 'warm', 'triangular', (15, 25, 35))
system.add_term('temperature', 'hot', 'trapezoidal', (30, 35, 40, 40))

# Humidity: dry, normal, wet
system.add_term('humidity', 'dry', 'trapezoidal', (0, 0, 20, 40))
system.add_term('humidity', 'normal', 'triangular', (30, 50, 70))
system.add_term('humidity', 'wet', 'trapezoidal', (60, 80, 100, 100))

# Define OUTPUT membership functions
# Fan speed: low, medium, high
system.add_term('fan_speed', 'low', 'triangular', (0, 0, 50))
system.add_term('fan_speed', 'medium', 'triangular', (25, 50, 75))
system.add_term('fan_speed', 'high', 'triangular', (50, 100, 100))

Important Notes¶

All membership functions must be defined before learning
The learner will create all possible rules (Cartesian product)
With 3 MFs per input and 2 inputs, there are 3 × 3 = 9 rules in total
Each rule can have any of the 3 output terms (low/medium/high)
MamdaniLearning finds the best assignment of outputs to rules

2. Instantiate MamdaniLearning Class¶

# Create learner
learner = MamdaniLearning(
    fis=system,           # Pre-configured Mamdani system
    num_points=1000,      # Discretization points for output universe
    verbose=True          # Print progress information
)

Parameters¶

fis: MamdaniSystem
Must have all input/output variables and terms defined
Rules will be created/optimized automatically
num_points: int, default=1000
Number of discretization points for output domain
Higher = more accurate defuzzification, but slower
Typical range: 500-2000
verbose: bool, default=True
Print detailed progress during optimization
Shows iteration count, cost evolution, temperature (SA), etc.
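
The accuracy/speed trade-off behind num_points can be illustrated with a standalone sketch. It assumes centroid defuzzification over an evenly spaced grid, which this guide does not confirm as the library's method; `triangular` and `centroid` are hypothetical helpers written for this example.

```python
import numpy as np

def triangular(x, a, b, c):
    """Triangular MF with feet a, c and peak b (a == b or b == c gives a shoulder)."""
    x = np.asarray(x, dtype=float)
    left = np.ones_like(x) if b == a else (x - a) / (b - a)
    right = np.ones_like(x) if c == b else (c - x) / (c - b)
    return np.clip(np.minimum(left, right), 0.0, 1.0)

def centroid(universe, params, num_points):
    """Discretized centroid of one triangular output set."""
    x = np.linspace(universe[0], universe[1], num_points)
    mu = triangular(x, *params)
    return float(np.sum(x * mu) / np.sum(mu))

exact = 50.0 / 3.0  # analytic centroid of the 'low' set (0, 0, 50)
coarse = centroid((0, 100), (0, 0, 50), num_points=11)
fine = centroid((0, 100), (0, 0, 50), num_points=1001)

print(f"exact={exact:.3f} coarse={coarse:.3f} fine={fine:.3f}")
```

In this sketch the 11-point grid lands near 13.3 while the 1001-point grid lands near 16.6, against an exact value of about 16.7. The finer grid costs proportionally more arithmetic per prediction.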

3. Training with `fit_rules()`¶

The fit_rules() method optimizes the fuzzy rules using metaheuristic algorithms:

# Prepare data
X_train = ...  # shape: (n_samples, n_features)
y_train = ...  # shape: (n_samples,)

# Train with Simulated Annealing (default)
learner.fit_rules(
    X_train=X_train,
    y_train=y_train,
    optimizer='sa',                      # Algorithm: 'sa', 'ga', 'pso', 'de'
    optimizer_params=None,               # Optional: algorithm-specific parameters
    initial_solution_method='random'     # Initialization: 'random', 'uniform', 'gradient'
)

Parameters¶

X_train: np.ndarray, shape (n_samples, n_features)
Training input data
Must match number of input variables in system
y_train: np.ndarray, shape (n_samples,)
Training target values
Currently supports single-output systems
optimizer: str, default='sa'
Optimization algorithm:
- 'sa': Simulated Annealing (recommended for small-medium problems)
- 'ga': Genetic Algorithm (good for discrete optimization)
- 'pso': Particle Swarm Optimization (fast convergence)
- 'de': Differential Evolution (robust)
optimizer_params: dict, optional
Algorithm-specific hyperparameters (see section 4)
initial_solution_method: str, default='random'
How to initialize the rule consequents:
- 'random': Random assignment of output terms
- 'uniform': All rules start with middle output term
- 'gradient': Data-driven initialization (recommended for faster convergence)
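
As a rough picture of the first two options, a candidate solution can be treated as a vector holding one output-term index per rule. The sketch below assumes that encoding and uses plain NumPy; it is not the library's code, and the 'gradient' method is omitted because its procedure is not described here.

```python
import numpy as np

n_rules, n_output_terms = 9, 3
rng = np.random.default_rng(0)

# 'random': every rule draws an output-term index independently.
random_init = rng.integers(0, n_output_terms, size=n_rules)

# 'uniform': every rule starts at the middle output term.
uniform_init = np.full(n_rules, n_output_terms // 2)

print(random_init)
print(uniform_init)
```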

Return Value¶

Returns self for method chaining:

learner = MamdaniLearning(system).fit_rules(X_train, y_train)

4. Optimizer-Specific Parameters¶

4.1 Simulated Annealing (`optimizer='sa'`)¶

Best for: Small-to-medium rule bases; convergence is guaranteed in theory only under sufficiently slow cooling

learner.fit_rules(
    X_train, y_train,
    optimizer='sa',
    optimizer_params={
        'temperature_init': 100.0,      # Initial temperature
        'temperature_min': 0.01,        # Minimum temperature (stopping criterion)
        'cooling_rate': 0.95,           # Temperature decay (0.9-0.99)
        'max_iterations': 5000,         # Maximum iterations
        'plateau_iterations': 1000,     # Stop if no improvement for N iterations
        'cooling_schedule': 'exponential'  # 'exponential', 'linear', or 'logarithmic'
    }
)

Parameters:
- temperature_init: Starting temperature (higher = more exploration)
- cooling_rate: Decay factor per iteration (closer to 1 = slower cooling)
- cooling_schedule: How temperature decreases
  - 'exponential': T *= cooling_rate (default, balanced)
  - 'linear': T -= (T_init - T_min) / max_iter (fast cooling)
  - 'logarithmic': T = T_init / log(1 + iteration) (slow cooling)

Tips:
- Start with high temperature (50-200) for good exploration
- Use cooling_rate 0.90-0.99 (0.95 is good default)
- Increase max_iterations if solution hasn't converged
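
The three cooling formulas can be compared with a small standalone sketch. It restates them in closed form, floors the result at temperature_min, and assumes iterations are counted from 1 so the logarithm is never zero. The acceptance function is the standard Metropolis rule, included as an assumption about how worse moves are accepted; neither function is taken from the library.

```python
import math

def temperature(schedule, iteration, t_init=100.0, t_min=0.01,
                cooling_rate=0.95, max_iter=5000):
    """Temperature after `iteration` steps (iteration >= 1), floored at t_min."""
    if schedule == "exponential":
        t = t_init * cooling_rate ** iteration
    elif schedule == "linear":
        t = t_init - iteration * (t_init - t_min) / max_iter
    elif schedule == "logarithmic":
        t = t_init / math.log(1 + iteration)
    else:
        raise ValueError(f"unknown schedule: {schedule}")
    return max(t, t_min)

def accept_probability(delta_cost, t):
    """Metropolis rule: always accept improvements, sometimes accept worse moves."""
    return 1.0 if delta_cost <= 0 else math.exp(-delta_cost / t)

for name in ("exponential", "linear", "logarithmic"):
    print(name, round(temperature(name, 100), 4))
```

With the values shown above (temperature_init=100, cooling_rate=0.95, temperature_min=0.01), the exponential schedule reaches the minimum temperature after about 180 iterations, well before max_iterations. A cooling_rate closer to 1 keeps the search exploratory for longer.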

4.2 Genetic Algorithm (`optimizer='ga'`)¶

Best for: Large rule bases, diverse solution exploration

learner.fit_rules(
    X_train, y_train,
    optimizer='ga',
    optimizer_params={
        'pop_size': 100,                # Population size
        'max_gen': 500,                 # Maximum generations
        'elite_ratio': 0.15,            # Fraction of elites to keep (0.1-0.2)
        'crossover_rate': 0.8,          # Probability of crossover (0.7-0.9)
        'crossover_type': 'uniform',    # 'uniform' or 'single_point'
        'mutation_rate': 0.05,          # Probability of mutation (0.01-0.1)
        'tournament_size': 5,           # Tournament selection size (3-7)
        'adaptive_mutation': True,      # Increase mutation when stagnant
        'plateau_generations': 50,      # Trigger adaptive mutation after N gens
        'mutation_boost_factor': 2.0    # Mutation rate multiplier when stagnant
    }
)

Parameters:
- pop_size: Number of individuals (50-200, larger for complex problems)
- elite_ratio: Fraction of best individuals preserved (typical: 0.1-0.2)
- crossover_rate: Higher = more recombination (0.7-0.9)
- mutation_rate: Higher = more exploration (0.01-0.1)
- adaptive_mutation: Automatically increases mutation when stuck

Tips:
- Use crossover_type='uniform' for better exploration
- Enable adaptive_mutation=True to escape local optima
- Balance: high crossover (0.8) + low mutation (0.05)
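
To make the crossover and mutation settings concrete, here is a minimal sketch of textbook uniform crossover and per-gene mutation on integer rule vectors. It is a generic illustration under the assumption that each gene is an output-term index; the function names are hypothetical and the library's operators may differ.

```python
import numpy as np

rng = np.random.default_rng(42)
n_output_terms = 3

def uniform_crossover(parent_a, parent_b, rng):
    """Each gene is copied from parent_a or parent_b with equal probability."""
    mask = rng.random(parent_a.shape) < 0.5
    return np.where(mask, parent_a, parent_b)

def mutate(chromosome, mutation_rate, n_terms, rng):
    """Each gene is resampled to a random term index with probability mutation_rate."""
    mask = rng.random(chromosome.shape) < mutation_rate
    resampled = rng.integers(0, n_terms, size=chromosome.shape)
    return np.where(mask, resampled, chromosome)

parent_a = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
parent_b = np.array([2, 2, 2, 1, 1, 1, 0, 0, 0])
child = mutate(uniform_crossover(parent_a, parent_b, rng),
               mutation_rate=0.05, n_terms=n_output_terms, rng=rng)
print(child)
```

With 9 genes and mutation_rate=0.05, fewer than one gene per child is resampled on average, so most variation comes from crossover.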

4.3 Particle Swarm Optimization (`optimizer='pso'`)¶

Best for: Fast convergence, continuous-like exploration

learner.fit_rules(
    X_train, y_train,
    optimizer='pso',
    optimizer_params={
        'n_particles': 30,          # Number of particles (20-50)
        'n_iterations': 100,        # Number of iterations (50-200)
        'w_max': 0.9,               # Initial inertia weight (exploration)
        'w_min': 0.4,               # Final inertia weight (exploitation)
        'c1': 1.49618,              # Cognitive parameter (personal best attraction)
        'c2': 1.49618               # Social parameter (global best attraction)
    }
)

Parameters:
- w_max/w_min: Inertia weight (linearly decreases from w_max to w_min)
  - High w = exploration, low w = exploitation
- c1: Attraction to personal best (typical: 1.5-2.0)
- c2: Attraction to global best (typical: 1.5-2.0)

Tips:
- Use c1 ≈ c2 for balanced exploration/exploitation
- Increase c1 for more individual exploration
- Increase c2 for faster convergence (may get stuck)
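
The roles of w, c1, and c2 show up directly in the canonical PSO update, sketched below in NumPy. The rounding step that maps continuous positions to output-term indices is an assumption, since this guide does not say how the discrete adaptation is done; all function names are hypothetical.

```python
import numpy as np

def inertia(iteration, n_iterations, w_max=0.9, w_min=0.4):
    """Inertia weight decreasing linearly from w_max to w_min."""
    return w_max - (w_max - w_min) * iteration / (n_iterations - 1)

def pso_step(pos, vel, personal_best, global_best, w, rng,
             c1=1.49618, c2=1.49618):
    """One canonical PSO velocity and position update on continuous positions."""
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = w * vel + c1 * r1 * (personal_best - pos) + c2 * r2 * (global_best - pos)
    return pos + vel, vel

def to_rule_indices(pos, n_terms):
    """Assumed discretization: round and clip positions to valid term indices."""
    return np.clip(np.rint(pos), 0, n_terms - 1).astype(int)

rng = np.random.default_rng(0)
pos = rng.uniform(0, 2, size=9)          # one coordinate per rule
vel = np.zeros(9)
pos, vel = pso_step(pos, vel, personal_best=pos.copy(),
                    global_best=np.full(9, 1.0), w=inertia(0, 100), rng=rng)
rules = to_rule_indices(pos, n_terms=3)
print(rules)
```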

4.4 Differential Evolution (`optimizer='de'`)¶

Best for: Robust global optimization, fewer parameters

learner.fit_rules(
    X_train, y_train,
    optimizer='de',
    optimizer_params={
        'pop_size': 50,             # Population size (30-100)
        'max_iter': 100,            # Maximum iterations (50-200)
        'F': 0.8,                   # Differential weight (0.5-1.0)
        'CR': 0.9                   # Crossover probability (0.7-0.95)
    }
)

Parameters:
- F: Differential weight (controls mutation scale)
  - Low F (0.5-0.7) = conservative
  - High F (0.8-1.0) = aggressive exploration
- CR: Crossover rate
  - High CR (0.8-0.95) = more information exchange

Tips:
- Use F=0.8, CR=0.9 as good defaults
- Increase pop_size for complex problems
- More stable than GA, fewer hyperparameters
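
The effect of F and CR is easiest to see in the classic DE/rand/1/bin trial-vector construction, sketched here in NumPy. This is the textbook variant, named plainly because the guide does not state which DE strategy the library uses; the final rounding to term indices is likewise an assumption.

```python
import numpy as np

def de_trial(population, target_idx, F, CR, rng):
    """Build one DE/rand/1/bin trial vector for population[target_idx]."""
    n_pop, n_dim = population.shape
    # Pick three distinct individuals, none of them the target.
    candidates = [i for i in range(n_pop) if i != target_idx]
    a, b, c = population[rng.choice(candidates, size=3, replace=False)]
    mutant = a + F * (b - c)
    # Binomial crossover; one forced index guarantees a mutant gene in the trial.
    cross = rng.random(n_dim) < CR
    cross[rng.integers(n_dim)] = True
    return np.where(cross, mutant, population[target_idx])

rng = np.random.default_rng(1)
population = rng.uniform(0, 2, size=(50, 9))   # pop_size=50, 9 rules
trial = de_trial(population, target_idx=0, F=0.8, CR=0.9, rng=rng)

# Assumed discretization step: round and clip to valid output-term indices.
trial_rules = np.clip(np.rint(trial), 0, 2).astype(int)
print(trial_rules)
```

A larger F scales the difference vector and pushes the mutant further from its base, while a larger CR copies more mutant genes into the trial.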

5. Making Predictions¶

After training, use the learned rules to make predictions:

# Make predictions
X_test = ...  # shape: (n_samples, n_features)
y_pred = learner.predict(X_test)

Prediction Methods¶

# Basic prediction
y_pred = learner.predict(X_test)

# Calculate RMSE on test set
rmse = learner.score(X_test, y_test)
print(f"Test RMSE: {rmse:.4f}")

6. Accessing Learned Rules and History¶

6.1 Get Learned Rules¶

# Get best rule consequent indices
rules = learner.get_rules()
print(f"Optimized rules: {rules}")
# Example output: [0, 2, 1, 0, 2, 1, 0, 1, 2]
# This means: Rule 0 uses output term 0 (low)
#             Rule 1 uses output term 2 (high)
#             Rule 2 uses output term 1 (medium), etc.
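
The index vector can be turned into readable IF-THEN statements with a short standalone sketch. It assumes that rule i corresponds to the i-th element of the Cartesian product of input terms, with the last input varying fastest; this guide does not confirm that ordering, so check it against your system before relying on the output.

```python
from itertools import product

input_terms = {
    "temperature": ["cold", "warm", "hot"],
    "humidity": ["dry", "normal", "wet"],
}
output_name, output_terms = "fan_speed", ["low", "medium", "high"]
rules = [0, 2, 1, 0, 2, 1, 0, 1, 2]  # example vector from above

# Assumption: rules follow Cartesian-product order of the input terms.
readable = []
for antecedent, term_idx in zip(product(*input_terms.values()), rules):
    conditions = " AND ".join(
        f"{var} is {term}" for var, term in zip(input_terms, antecedent)
    )
    readable.append(
        f"IF {conditions} THEN {output_name} is {output_terms[term_idx]}"
    )

for line in readable:
    print(line)
```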

6.2 Get Best Cost¶

# Get final optimization cost (RMSE)
best_cost = learner.get_cost()
print(f"Best RMSE: {best_cost:.6f}")

6.3 Get Optimization History¶

# Get convergence history
history = learner.get_history()

# Plot convergence
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 5))
plt.plot(history['costs'])
plt.xlabel('Iteration')
plt.ylabel('RMSE')
plt.title('Optimization Convergence')
plt.grid(True)
plt.show()

History dictionary contains:

{
    'costs': list,           # Cost at each iteration
    'temperatures': list,    # Temperature schedule (SA only)
    'acceptances': list,     # Acceptance rate (SA only)
    'best_fitnesses': list,  # Best fitness per generation (GA only)
    'avg_fitnesses': list,   # Average fitness per generation (GA only)
    # ... optimizer-specific metrics
}

7. Complete Example¶

import numpy as np
from fuzzy_systems.inference import MamdaniSystem
from fuzzy_systems.learning import MamdaniLearning
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

# ============================================================================
# 1. Generate synthetic data
# ============================================================================
np.random.seed(42)
X = np.random.uniform(0, 10, (500, 2))
y = np.sin(X[:, 0]) * np.cos(X[:, 1]) + np.random.normal(0, 0.1, 500)

# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# ============================================================================
# 2. Create and configure Mamdani system
# ============================================================================
system = MamdaniSystem()

# Inputs
system.add_input('x1', (0, 10))
system.add_input('x2', (0, 10))

# Add 3 terms per input
for var in ['x1', 'x2']:
    system.add_term(var, 'low', 'triangular', (0, 0, 5))
    system.add_term(var, 'medium', 'triangular', (2.5, 5, 7.5))
    system.add_term(var, 'high', 'triangular', (5, 10, 10))

# Output
system.add_output('y', (-2, 2))
system.add_term('y', 'negative', 'triangular', (-2, -2, 0))
system.add_term('y', 'zero', 'triangular', (-1, 0, 1))
system.add_term('y', 'positive', 'triangular', (0, 2, 2))

print("Total possible rules: 3 × 3 = 9")
print("Each rule can have 3 different consequents (negative/zero/positive)")

# ============================================================================
# 3. Create learner and optimize rules
# ============================================================================
learner = MamdaniLearning(system, num_points=1000, verbose=True)

# Train with Simulated Annealing
learner.fit_rules(
    X_train, y_train,
    optimizer='sa',
    optimizer_params={
        'temperature_init': 100.0,
        'cooling_rate': 0.95,
        'max_iterations': 3000,
        'plateau_iterations': 500
    },
    initial_solution_method='gradient'
)

# ============================================================================
# 4. Evaluate
# ============================================================================
y_pred_train = learner.predict(X_train)
y_pred_test = learner.predict(X_test)

rmse_train = learner.score(X_train, y_train)
rmse_test = learner.score(X_test, y_test)

print(f"\n{'='*70}")
print("RESULTS")
print(f"{'='*70}")
print(f"Train RMSE: {rmse_train:.6f}")
print(f"Test RMSE:  {rmse_test:.6f}")
print(f"Best rules: {learner.get_rules()}")

# ============================================================================
# 5. Plot convergence
# ============================================================================
history = learner.get_history()

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))

# Cost evolution
ax1.plot(history['costs'])
ax1.set_xlabel('Iteration')
ax1.set_ylabel('RMSE')
ax1.set_title('Cost Evolution')
ax1.grid(True)

# Temperature evolution (SA only)
if 'temperatures' in history:
    ax2.plot(history['temperatures'])
    ax2.set_xlabel('Iteration')
    ax2.set_ylabel('Temperature')
    ax2.set_title('Temperature Schedule')
    ax2.grid(True)

plt.tight_layout()
plt.show()

# ============================================================================
# 6. Plot predictions
# ============================================================================
plt.figure(figsize=(10, 5))
plt.scatter(y_test, y_pred_test, alpha=0.6)
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'r--', lw=2)
plt.xlabel('True Values')
plt.ylabel('Predictions')
plt.title(f'Test Set Predictions (RMSE: {rmse_test:.4f})')
plt.grid(True)
plt.show()

8. Choosing the Right Optimizer¶

Decision Guide¶

┌─────────────────────────────────────┐
│  How many rules in your system?     │
└──────────────┬──────────────────────┘
               │
       ┌───────┴───────┐
       │               │
    < 20 rules     > 20 rules
       │               │
       ▼               ▼
  Use SA/PSO      Use GA/DE
  (faster)        (better exploration)
       │               │
       └───────┬───────┘
               │
               ▼
┌─────────────────────────────────────┐
│  Do you need guaranteed convergence?│
└──────────────┬──────────────────────┘
               │
       ┌───────┴───────┐
       │               │
      YES              NO
       │               │
       ▼               ▼
     Use SA      Use GA/PSO/DE
   (proven)       (faster, stochastic)

Recommendations by Problem Size¶

Rules	Best Algorithm	Typical Time	Parameters
< 10	SA	Fast (seconds)	temperature_init=100, cooling_rate=0.95
10-30	SA or PSO	Medium (minutes)	SA: cooling_rate=0.97, PSO: n_particles=30
30-100	GA or DE	Slow (minutes-hours)	GA: pop_size=100, DE: pop_size=50
> 100	GA	Very slow	pop_size=200, max_gen=1000

Algorithm Characteristics¶

Simulated Annealing (SA):
- ✅ Theoretical convergence guarantee (under sufficiently slow cooling)
- ✅ Few hyperparameters
- ✅ Good for small-medium problems
- ❌ Slow for large problems
- ❌ Sequential (no parallelization)

Genetic Algorithm (GA):
- ✅ Excellent exploration
- ✅ Handles large search spaces
- ✅ Can be parallelized
- ❌ Many hyperparameters to tune
- ❌ Can be slow to converge

Particle Swarm Optimization (PSO):
- ✅ Fast convergence
- ✅ Few parameters
- ✅ Good balance exploration/exploitation
- ❌ Can get stuck in local optima
- ❌ Adapted for discrete spaces (may be suboptimal)

Differential Evolution (DE):
- ✅ Robust and reliable
- ✅ Very few parameters
- ✅ Good for difficult landscapes
- ❌ Slower than PSO
- ❌ Adapted for discrete spaces

9. Tips and Best Practices¶

Membership Function Design¶

# Good: 3-5 MFs with 50% overlap
system.add_term('temp', 'low', 'triangular', (0, 0, 5))
system.add_term('temp', 'med', 'triangular', (2.5, 5, 7.5))
system.add_term('temp', 'high', 'triangular', (5, 10, 10))

# Bad: Too many MFs = exponential rule growth
# 5 MFs × 5 MFs × 5 MFs = 125 rules!

Initial Solution¶

# Random: Good default
learner.fit_rules(X, y, initial_solution_method='random')

# Gradient: Best for faster convergence (uses data)
learner.fit_rules(X, y, initial_solution_method='gradient')

# Uniform: All rules start with middle term (conservative)
learner.fit_rules(X, y, initial_solution_method='uniform')

Monitoring Convergence¶

# Check if optimization converged
history = learner.get_history()
costs = history['costs']

# Plateau detection
last_100 = costs[-100:]
if max(last_100) - min(last_100) < 1e-4:
    print("✓ Converged (cost plateau)")
else:
    print("⚠️ Not converged, increase max_iterations")

Overfitting Prevention¶

# Use validation set for early stopping (manual)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2)

learner.fit_rules(X_train, y_train, optimizer='sa')

rmse_train = learner.score(X_train, y_train)
rmse_val = learner.score(X_val, y_val)

if rmse_val > rmse_train * 1.5:
    print("⚠️ Overfitting detected")

Computational Efficiency¶

# Pre-computation is automatic, but you can control discretization
learner = MamdaniLearning(
    system,
    num_points=500    # Lower = faster, less accurate defuzzification
)

# For very large datasets, consider sampling
if len(X_train) > 1000:
    indices = np.random.choice(len(X_train), 1000, replace=False)
    X_sample = X_train[indices]
    y_sample = y_train[indices]
    learner.fit_rules(X_sample, y_sample)

10. Troubleshooting¶

Problem	Possible Cause	Solution
Slow convergence	Too many rules	Reduce MFs per variable
Stuck in local optimum	SA cooling too fast	Increase cooling_rate (0.95→0.98)
Poor accuracy	Bad MF placement	Adjust MF centers/shapes
Overfitting	Too many rules	Reduce MFs, add regularization
No improvement	Bad initialization	Try `initial_solution_method='gradient'`
NaN values	Defuzzification issues	Check MF coverage, increase num_points

11. Comparison with Other Methods¶

vs Wang-Mendel¶

Wang-Mendel:
- ✅ Very fast (single pass)
- ✅ No hyperparameters
- ❌ No optimization (may be suboptimal)

MamdaniLearning:
- ✅ Optimizes for best accuracy
- ✅ Flexible (multiple algorithms)
- ❌ Slower (iterative)

When to use which:
- Use Wang-Mendel for quick baseline
- Use MamdaniLearning to improve accuracy

vs ANFIS¶

ANFIS:
- ✅ Learns MF parameters + consequents
- ✅ Gradient-based (efficient for continuous)
- ❌ Complex implementation
- ❌ Requires differentiable MFs

MamdaniLearning:
- ✅ Simpler (only learns consequents)
- ✅ Works with any MF type
- ✅ Interpretable rules
- ❌ Doesn't adjust MF shapes

When to use which:
- Use MamdaniLearning for interpretability, fixed MFs
- Use ANFIS for maximum accuracy, flexible MFs

12. Advanced Usage¶

Sequential Optimization¶

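# Step 1: Quick baseline with Wang-Mendel
from fuzzy_systems.learning import WangMendelLearning
wm = WangMendelLearning(system, X_train, y_train)
wm.fit()

# Step 2: Search for better consequents with MamdaniLearning
# Note: 'random' starts the search from a random assignment; none of the
# initialization options documented above takes the Wang-Mendel rules as a seed.
learner = MamdaniLearning(system)
learner.fit_rules(X_train, y_train, optimizer='pso',
                  initial_solution_method='random')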

Comparing Multiple Optimizers¶

results = {}

for opt in ['sa', 'ga', 'pso', 'de']:
    learner = MamdaniLearning(system, verbose=False)
    learner.fit_rules(X_train, y_train, optimizer=opt)
    rmse = learner.score(X_test, y_test)
    results[opt] = rmse
    print(f"{opt.upper()}: RMSE = {rmse:.6f}")

best_opt = min(results, key=results.get)
print(f"\nBest optimizer: {best_opt.upper()}")
