This directory contains comprehensive examples demonstrating how to use the torchTextClassifiers package for various text classification tasks.
A simple binary sentiment classification example that covers:
- Creating a FastText classifier
- Preparing training and validation data
- Building and training the model
- Making predictions and evaluating performance
- Saving model configuration
Run the example:
```bash
cd examples
uv run --extra huggingface python basic_classification.py
```

What you'll learn:
- Basic API usage
- Binary classification workflow
- Model evaluation
- Configuration persistence
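The data preparation behind this example boils down to parallel NumPy arrays of texts and integer labels. A minimal, self-contained sketch (the sample texts and the validation split below are illustrative, not the exact data used in the script):

```python
import numpy as np

# Parallel arrays: one text per label (0 = negative, 1 = positive)
X_train = np.array([
    "This is an amazing product with great features!",
    "Completely disappointed with this purchase",
    "Excellent build quality and works as expected",
    "Terrible, would not recommend",
])
y_train = np.array([1, 0, 1, 0])

# Hold out a small validation slice from the end
X_val, y_val = X_train[-2:], y_train[-2:]

assert len(X_train) == len(y_train), "texts and labels must align"
print(f"Training samples: {len(X_train)}")
```

Whatever data you use, the invariant is the same: texts and labels are index-aligned arrays of equal length.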
Demonstrates 3-class sentiment analysis (positive, negative, neutral):
- Multi-class data preparation
- Class distribution handling
- Detailed result analysis
- Configuration loading and validation
Run the example:
```bash
cd examples
uv run --extra huggingface python multiclass_classification.py
```

What you'll learn:
- Multi-class classification setup
- Class imbalance considerations
- Advanced result interpretation
- Model serialization/deserialization
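Before multi-class training, it is worth verifying the class distribution; `np.unique` with `return_counts=True` does this in one pass. A small sketch (the label names and counts mirror the sample output further down, but are otherwise illustrative):

```python
import numpy as np

# Integer labels for a 3-class problem: 0=Negative, 1=Neutral, 2=Positive
y_train = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2])
label_names = {0: "Negative", 1: "Neutral", 2: "Positive"}

# np.unique returns the distinct classes and their counts together
classes, counts = np.unique(y_train, return_counts=True)
distribution = {label_names[c]: int(n) for c, n in zip(classes, counts)}
print(distribution)  # {'Negative': 5, 'Neutral': 5, 'Positive': 5}
```

A heavily skewed distribution is a signal to rebalance or reweight before blaming the model for poor minority-class accuracy.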
Shows how to combine text and categorical features:
- Text + categorical data preparation
- Feature engineering for categorical variables
- Comparing mixed vs. text-only models
- Performance analysis with different feature types
Run the example:
```bash
cd examples
uv run --extra huggingface python Using_additional_features.py
```

What you'll learn:
- Mixed feature classification
- Categorical feature configuration
- Feature importance analysis
- Model comparison techniques
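The core of the mixed-feature setup is stacking the text column with integer-encoded categorical columns, as the customization snippet later in this README also shows. A minimal sketch with made-up feature values:

```python
import numpy as np

# One text column plus two illustrative categorical columns (integer-encoded)
texts = np.array(["Great product", "Awful service", "Decent value"])
categories = np.array([[0, 2], [1, 0], [0, 1]])  # e.g. product type, region

# np.column_stack turns the 1-D text array into a column and appends the
# categorical columns, yielding one row per sample
X_mixed = np.column_stack([texts, categories])
print(X_mixed.shape)  # (3, 3): one text column + two categorical columns
```

Note that stacking mixed types upcasts everything to strings; the categorical values are decoded back to integers downstream, so keep them as simple integer codes.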
Explores advanced training configurations:
- Custom PyTorch Lightning trainer parameters
- Different hardware configurations (CPU/GPU)
- Training optimization techniques
- Model comparison and selection
Run the example:
```bash
cd examples
uv run --extra huggingface python advanced_training.py
```

What you'll learn:
- Advanced training configurations
- Hardware-specific optimizations
- Training parameter tuning
- Model performance comparison
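The trainer-level knobs discussed here are standard `pytorch_lightning.Trainer` arguments; exactly how they are forwarded to the classifier's training call depends on the package API, so treat the dictionary below as an illustrative configuration, not a fixed contract:

```python
# Standard pytorch_lightning.Trainer keyword arguments (illustrative values)
trainer_params = {
    "max_epochs": 50,
    "accelerator": "auto",         # picks GPU when available, else CPU
    "devices": 1,
    "precision": 16,               # mixed precision; needs a supported GPU
    "accumulate_grad_batches": 4,  # effective batch = batch_size * 4
}

# A CPU-only variant for small datasets: full precision, no GPU assumptions
cpu_params = {**trainer_params, "accelerator": "cpu", "precision": 32}
print(cpu_params["accelerator"])
```

Keeping the configuration as a plain dict makes it easy to define one baseline and derive hardware-specific variants for comparison, as the example does.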
Demonstrates model explainability with ASCII histogram visualizations:
- Training a FastText classifier with enhanced data
- Word-level contribution analysis
- ASCII histogram visualization in terminal
- Interactive mode for custom text analysis
- Real-time prediction explanations
Run the example:
```bash
cd examples

# Regular mode - analyze predefined examples
uv run --extra huggingface python simple_explainability_example.py

# Interactive mode - analyze your own text
uv run --extra huggingface python simple_explainability_example.py --interactive
```

What you'll learn:
- Model explainability and interpretation
- Word importance analysis
- Interactive prediction tools
- ASCII-based data visualization
- Real-time model analysis
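The histograms in the sample output further down can be reproduced with a few lines of plain Python: scale each word's contribution score against the maximum so the top word gets a full-width bar. A self-contained sketch (the scores are taken from the sample output; the function name is ours, not part of the package API):

```python
def ascii_histogram(words, scores, width=30):
    """Render word-contribution bars: the highest score gets a
    full-width bar, the rest are scaled proportionally."""
    pad = max(len(w) for w in words)   # align the | separators
    top = max(scores)
    lines = []
    for word, score in zip(words, scores):
        bar = "█" * max(1, round(width * score / top))
        lines.append(f"{word:<{pad}} | {bar} {score:.4f}")
    return "\n".join(lines)

print(ascii_histogram(
    ["Amazing", "product", "quality!"],
    [0.5429, 0.2685, 0.1886],
))
```

Because the bars are pure text, this works in any terminal and in logs, with no plotting dependency.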
To run any example:
- Install dependencies: `uv sync`
- Navigate to the examples directory: `cd examples`
- Run an example: `uv run --extra huggingface python basic_classification.py`

🚀 Basic Text Classification Example
==================================================
📝 Creating sample data...
Training samples: 10
Validation samples: 2
Test samples: 3
🏗️ Creating FastText classifier...
🔨 Building model...
✅ Model built successfully!
🎯 Training model...
✅ Training completed!
🔮 Making predictions...
Predictions: [1 0 1]
True labels: [1 0 1]
Test accuracy: 1.000
📊 Detailed Results:
----------------------------------------
1. ✅ Predicted: Positive
Text: This is an amazing product with great features!...
2. ✅ Predicted: Negative
Text: Completely disappointed with this purchase...
3. ✅ Predicted: Positive
Text: Excellent build quality and works as expected...
💾 Saving model configuration...
✅ Configuration saved to 'basic_classifier_config.json'
🎉 Example completed successfully!
🎭 Multi-class Text Classification Example
==================================================
📝 Creating multi-class sentiment data...
Training samples: 15
Class distribution: Negative=5, Neutral=5, Positive=5
🏗️ Creating multi-class FastText classifier...
🔨 Building model...
✅ Model built successfully!
🎯 Training model...
✅ Training completed!
📊 Detailed Results:
------------------------------------------------------------
1. ✅ Predicted: Negative, True: Negative
Text: This is absolutely horrible!
2. ✅ Predicted: Neutral, True: Neutral
Text: It's an average product, nothing more.
3. ✅ Predicted: Positive, True: Positive
Text: Fantastic! Love every aspect of it!
Final Accuracy: 3/3 = 1.000
🔍 Simple Explainability Example
🔍 Testing explainability on 5 examples:
============================================================
📝 Example 1:
Text: 'This product is amazing!'
Prediction: Positive
📊 Word Contribution Histogram:
--------------------------------------------------
This | ██████████████████████████████ 0.3549
product | █████████████ 0.1651
is | ████████████████████████ 0.2844
amazing! | ████████████████ 0.1956
--------------------------------------------------
✅ Analysis completed for example 1
📝 Example 2:
Text: 'Poor quality and terrible service'
Prediction: Negative
⚠️ Explainability failed:
✅ Analysis completed for example 2
📝 Example 3:
Text: 'Great value for money'
Prediction: Positive
📊 Word Contribution Histogram:
--------------------------------------------------
Great | ██████████████████████████████ 0.3287
value | ████████████████████ 0.2220
for | ██████████████████████████ 0.2929
money | ██████████████ 0.1564
--------------------------------------------------
✅ Analysis completed for example 3
📝 Example 4:
Text: 'Completely disappointing and awful experience'
Prediction: Negative
📊 Word Contribution Histogram:
--------------------------------------------------
Completely | ██████████ 0.1673
disappointing | ██████████████████████████████ 0.4676
and | █████ 0.0910
awful | ███████ 0.1225
experience | █████████ 0.1516
--------------------------------------------------
✅ Analysis completed for example 4
📝 Example 5:
Text: 'Love this excellent design'
Prediction: Positive
📊 Word Contribution Histogram:
--------------------------------------------------
Love | ██████████████████ 0.2330
this | ████████████████████ 0.2525
excellent | ██████████████████████████████ 0.3698
design | ███████████ 0.1447
--------------------------------------------------
✅ Analysis completed for example 5
🎉 Explainability analysis completed for 5 examples!
💡 Tip: Use --interactive flag to enter interactive mode for custom text analysis!
Example: uv run python examples/simple_explainability_example.py --interactive
============================================================
🎯 Interactive Explainability Mode
============================================================
Enter your own text to see predictions and explanations!
Type 'quit' or 'exit' to end the session.
💬 Enter text: Amazing product quality!
🔍 Analyzing: 'Amazing product quality!'
🎯 Prediction: Positive
📊 Word Contribution Histogram:
--------------------------------------------------
Amazing | ██████████████████████████████ 0.5429
product | ██████████████ 0.2685
quality! | ██████████ 0.1886
--------------------------------------------------
💡 Most influential word: 'Amazing' (score: 0.5429)
--------------------------------------------------
💬 Enter text: Terrible customer support
🔍 Analyzing: 'Terrible customer support'
🎯 Prediction: Negative
📊 Word Contribution Histogram:
--------------------------------------------------
Terrible | ██████████████████████████████ 0.5238
customer | ███████████ 0.1988
support | ███████████████ 0.2774
--------------------------------------------------
💡 Most influential word: 'Terrible' (score: 0.5238)
--------------------------------------------------
💬 Enter text: quit
👋 Thanks for using the explainability tool!
You can easily adapt the examples to your own data:
```python
# Replace the example data with your own
X_train = np.array([
    "Your text sample 1",
    "Your text sample 2",
    # ... more samples
])
y_train = np.array([0, 1, ...])  # Your labels
```

Experiment with different model parameters:
```python
classifier = create_fasttext(
    embedding_dim=200,  # Increase for better representations
    num_tokens=20000,   # Increase for larger vocabularies
    min_count=3,        # Increase to filter rare words
    num_epochs=100,     # Increase for more training
    batch_size=64,      # Adjust based on your hardware
)
```

Extend the examples with your own categorical features:
```python
# Add your categorical features
categorical_features = np.array([
    [category1, category2, category3],
    # ... more feature vectors
])
X_mixed = np.column_stack([text_data, categorical_features])
```

- Increase embedding dimensions for complex tasks
- Use more training data when available
- Tune n-gram parameters (min_n, max_n) for your domain
- Experiment with batch sizes and learning rates
- Consider mixed features if you have structured data
- Use sparse embeddings for large vocabularies
- Increase batch size (if memory allows)
- Reduce embedding dimensions for faster convergence
- Use CPU training for small datasets
- Adjust num_workers for optimal data loading
- Use gradient accumulation with small batch sizes
- Enable mixed precision training (precision=16)
- Implement data streaming for very large datasets
- Use multiple GPUs if available
- Memory errors:
  - Reduce batch_size
  - Use sparse=True
  - Reduce embedding_dim
- Slow training:
  - Increase batch_size
  - Reduce num_workers
  - Use CPU for small datasets
- Poor accuracy:
  - Increase training data
  - Tune hyperparameters
  - Check data quality
  - Increase num_epochs
- Import errors:
  - Run `uv sync` to install dependencies
  - Check Python version compatibility
If you encounter issues:
- Check the main README for setup instructions
- Review the API documentation
- Look at similar examples for reference
- Open an issue on GitHub with your specific problem
- Main README - Package overview and installation
- API Reference - Complete API documentation
- Developer Guide - Adding new classifier types
- Tests - Unit and integration tests for reference
We welcome new examples! If you have a use case that would benefit others:
- Follow the existing example structure
- Include comprehensive comments
- Add error handling and validation
- Test your example thoroughly
- Update this README with your addition
Example template structure:
```python
"""
Your Example Title

Brief description of what this example demonstrates.
"""

import numpy as np

from torchTextClassifiers import create_fasttext


def main():
    print("🚀 Your Example Title")
    print("=" * 50)

    # Your implementation here

    print("🎉 Example completed successfully!")


if __name__ == "__main__":
    main()
```