NET Vocal Remover API

Advanced AI-powered audio separation technology for splitting music tracks into vocal and instrumental components. Utilizes state-of-the-art ONNX neural network models with STFT-based processing and intelligent noise reduction.

Professional Audio Separation: The Vocal Remover API provides sophisticated AI-based source separation using deep learning models, with support for GPU acceleration, chunked processing, and multiple quality presets.

Overview

The Vocal Remover system enables professional-grade audio source separation by leveraging advanced AI models:

AI-Powered Separation: Uses ONNX neural network models for accurate vocal/instrumental separation
Multiple Model Options: Choose between Default, Best, and Karaoke models for different use cases
GPU Acceleration: Automatic CUDA support for faster processing when available
Chunked Processing: Handles long audio files with configurable chunk sizes and overlap margins
Noise Reduction: Optional advanced denoising for cleaner results
Progress Tracking: Real-time progress updates with detailed status information

Key Features

AI Model Selection

Three pre-trained models optimized for different scenarios: Default for balanced quality and speed, Best for maximum quality, and Karaoke for preserving background vocals.

STFT Processing

Short-Time Fourier Transform based processing with configurable FFT size, Hanning window, and frequency/time dimensions for precise spectral analysis.

Smart Chunking

Intelligent audio segmentation with overlapping margins for seamless processing of files of any length without memory constraints.

Hardware Acceleration

Automatic detection and utilization of CUDA-enabled GPUs for significantly faster processing, with CPU fallback for compatibility.

SimpleAudioSeparationService Class

The SimpleAudioSeparationService class provides the core functionality for audio separation.

AudioSeparationService Namespace

using OwnaudioNET.Features.Vocalremover;

// Create service with custom options
var options = new SimpleSeparationOptions
{
    Model = InternalModel.Best,
    OutputDirectory = "output",
    ChunkSizeSeconds = 15,
    DisableNoiseReduction = false
};

// 'using' ensures Dispose() releases the ONNX session when the scope ends
using var service = new SimpleAudioSeparationService(options);
service.Initialize();

// Separate audio file
SimpleSeparationResult result = service.Separate("input.mp3");

Console.WriteLine($"Vocals: {result.VocalsPath}");
Console.WriteLine($"Instrumental: {result.InstrumentalPath}");
Console.WriteLine($"Processing time: {result.ProcessingTime}");

Public API Methods

Method	Return	Description
`Initialize()`	void	Initializes the ONNX model session with GPU or CPU execution provider
`Separate(string)`	SimpleSeparationResult	Separates audio file into vocals and instrumental tracks
`Dispose()`	void	Releases all resources including ONNX session

Events

Event	Type	Description
`ProgressChanged`	EventHandler<SimpleSeparationProgress>	Raised when processing progress updates
`ProcessingCompleted`	EventHandler<SimpleSeparationResult>	Raised when separation completes successfully

Data Classes

SimpleSeparationOptions

Configuration parameters for the audio separation process.

SimpleSeparationOptions Class

public class SimpleSeparationOptions
{
    // ONNX model file path (optional if using internal models)
    public string? ModelPath { get; set; }

    // Internal model selection (Default, Best, Karaoke)
    public InternalModel Model { get; set; } = InternalModel.Best;

    // Output directory path
    public string OutputDirectory { get; set; } = "separated";

    // Disable noise reduction (enabled by default)
    public bool DisableNoiseReduction { get; set; } = false;

    // Margin size for overlapping chunks (in samples, default: 44100 = 1 second)
    public int Margin { get; set; } = 44100;

    // Chunk size in seconds (0 = process entire file at once)
    public int ChunkSizeSeconds { get; set; } = 15;

    // FFT size for STFT processing
    public int NFft { get; set; } = 6144;

    // Temporal dimension parameter (as power of 2)
    public int DimT { get; set; } = 8;

    // Frequency dimension parameter
    public int DimF { get; set; } = 2048;
}

SimpleSeparationProgress

Progress information for the separation process.

SimpleSeparationProgress Class

public class SimpleSeparationProgress
{
    // Current file being processed
    public string CurrentFile { get; set; }

    // Overall progress percentage (0-100)
    public double OverallProgress { get; set; }

    // Current processing step description
    public string Status { get; set; }

    // Number of chunks processed
    public int ProcessedChunks { get; set; }

    // Total number of chunks
    public int TotalChunks { get; set; }
}

SimpleSeparationResult

Result of the audio separation operation.

SimpleSeparationResult Class

public class SimpleSeparationResult
{
    // Path to the vocals output file
    public string VocalsPath { get; set; }

    // Path to the instrumental output file
    public string InstrumentalPath { get; set; }

    // Processing duration
    public TimeSpan ProcessingTime { get; set; }
}

Separation Models

Available Models

Three built-in models optimized for different use cases, each embedded in the library for seamless usage:

Model	Quality	Speed	Best For
`Default`	Good	Fast	General purpose separation with balanced quality and processing time. Ideal for quick previews and batch processing.
`Best`	Excellent	Slower	Maximum quality separation for professional applications. Produces the cleanest vocal and instrumental tracks with minimal artifacts.
`Karaoke`	Specialized	Medium	Removes lead vocals while preserving background vocals. Perfect for creating karaoke tracks with choir or backing vocal presence.

Model Selection

InternalModel Enum

public enum InternalModel
{
    None,      // Use custom model file via ModelPath
    Default,   // Balanced quality and speed
    Best,      // Highest quality, longer processing time
    Karaoke    // Preserve background vocals, remove lead vocal
}

// Usage examples
var options1 = new SimpleSeparationOptions { Model = InternalModel.Default };
var options2 = new SimpleSeparationOptions { Model = InternalModel.Best };
var options3 = new SimpleSeparationOptions { Model = InternalModel.Karaoke };

// Custom model
var options4 = new SimpleSeparationOptions
{
    Model = InternalModel.None,
    ModelPath = @"C:\Models\custom_model.onnx"
};

Model Characteristics

Default Model: The most balanced option, providing good quality separation with relatively fast processing times. Suitable for most use cases where a reasonable quality-speed tradeoff is acceptable.
Best Model: Utilizes a larger and more complex neural network architecture, resulting in superior separation quality with cleaner vocals and instrumental tracks. Processing time is approximately 2-3x longer than the Default model.
Karaoke Model: Specifically trained to remove only the lead vocal while preserving backing vocals, harmonies, and vocal effects. This creates a more authentic karaoke experience compared to complete vocal removal.

HTDemucs Model - Advanced Stem Separation

For more advanced audio separation needs, the HTDemucs (Hybrid Transformer Demucs) model provides state-of-the-art multi-stem separation capabilities. Unlike the basic vocal/instrumental separation, HTDemucs can separate audio into four distinct stems:

Stem	Description	Use Cases
`Vocals`	Singing and speech	Karaoke, vocal analysis, remixing
`Drums`	Percussion instruments	Rhythm analysis, drum replacement
`Bass`	Bass guitar and low-frequency instruments	Bass isolation, mixing, mastering
`Other`	All other instruments (guitars, keyboards, strings, etc.)	Instrumental remixing, analysis

Model Availability

NuGet Package: The HTDemucs model is embedded as a resource in the OwnaudioNET NuGet package, so there is no need to download or manage the model file separately when using the package.
Source Code: Because of GitHub file size limits, the source code repository does NOT include the HTDemucs model file. If you clone the repository, you must download the model separately from Hugging Face.

HTDemucs Usage Example

Using HTDemucs for Multi-Stem Separation

using OwnaudioNET.Features.HTDemucs;

// Create HTDemucs separator with embedded model
var options = new HTDemucsSeparationOptions
{
    Model = InternalModel.HTDemucs,  // Use embedded resource
    OutputDirectory = "output_htdemucs",
    ChunkSizeSeconds = 10,           // Chunk size (10-30s recommended)
    OverlapFactor = 0.25f,           // Overlap between chunks (0.25 = 25%)
    EnableGPU = true,                // Use GPU acceleration
    TargetStems = HTDemucsStem.All   // Extract all stems
};

using var separator = new HTDemucsAudioSeparator(options);
separator.Initialize();

// Progress tracking
separator.ProgressChanged += (s, progress) =>
{
    Console.WriteLine($"{progress.Status}: {progress.OverallProgress:F1}%");
    Console.WriteLine($"Chunks: {progress.ProcessedChunks}/{progress.TotalChunks}");
};

// Separate audio into stems
var result = separator.Separate("music.mp3");

Console.WriteLine($"Vocals: {result.VocalsPath}");
Console.WriteLine($"Drums: {result.DrumsPath}");
Console.WriteLine($"Bass: {result.BassPath}");
Console.WriteLine($"Other: {result.OtherPath}");
Console.WriteLine($"Processing time: {result.ProcessingTime}");

Selective Stem Extraction

Extract Only Specific Stems

// Extract only vocals and other instruments
var options = new HTDemucsSeparationOptions
{
    Model = InternalModel.HTDemucs,
    OutputDirectory = "output",
    TargetStems = HTDemucsStem.Vocals | HTDemucsStem.Other
};

// Extract only drums
var drumsOnly = new HTDemucsSeparationOptions
{
    Model = InternalModel.HTDemucs,
    OutputDirectory = "drums_output",
    TargetStems = HTDemucsStem.Drums
};

// Using helper methods
using var separator = HTDemucsExtensions.CreateDefaultSeparator("output_directory");
separator.Initialize();
var result = separator.Separate("music.mp3");
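
The `|` combination used for TargetStems above suggests that HTDemucsStem behaves like a flags enum. The following is a minimal sketch of that pattern only; `StemFlags`, `StemFlagsDemo`, and the numeric member values are hypothetical stand-ins chosen for illustration and are not taken from the OwnaudioNET source.

```csharp
using System;

// Hypothetical stand-in for HTDemucsStem; the member values are assumptions.
[Flags]
public enum StemFlags
{
    None   = 0,
    Vocals = 1 << 0,
    Drums  = 1 << 1,
    Bass   = 1 << 2,
    Other  = 1 << 3,
    All    = Vocals | Drums | Bass | Other
}

public static class StemFlagsDemo
{
    // Returns true when every stem in 'wanted' is present in 'selection'.
    public static bool Includes(StemFlags selection, StemFlags wanted) =>
        (selection & wanted) == wanted;

    // Counts how many individual stems are selected.
    public static int Count(StemFlags selection)
    {
        int count = 0;
        foreach (StemFlags stem in new[] { StemFlags.Vocals, StemFlags.Drums, StemFlags.Bass, StemFlags.Other })
        {
            if ((selection & stem) != 0) count++;
        }
        return count;
    }
}
```

With this layout, `StemFlags.Vocals | StemFlags.Other` selects two stems, and a separator could test each stem with `Includes` before writing its output file.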

HTDemucs Performance

HTDemucs provides superior separation quality but requires more computational resources:

Hardware	Processing Speed	Example (3 min song)
CPU (16 cores)	10-15x realtime	~12-18 seconds
GPU (NVIDIA RTX 3060)	50-100x realtime	~2-4 seconds
GPU (NVIDIA RTX 4090)	100-150x realtime	~1-2 seconds
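
The example column follows from dividing the audio duration by the speed factor. A small sketch of that arithmetic, assuming "Nx realtime" means N seconds of audio processed per second of wall-clock time (`RealtimeEstimate` is a hypothetical helper, not part of the library):

```csharp
using System;

public static class RealtimeEstimate
{
    // Seconds needed to process 'audioSeconds' of audio at a given speed factor
    // (a factor of 10 means ten seconds of audio are processed per second).
    public static double ProcessingSeconds(double audioSeconds, double realtimeFactor)
    {
        if (realtimeFactor <= 0)
            throw new ArgumentOutOfRangeException(nameof(realtimeFactor));
        return audioSeconds / realtimeFactor;
    }
}
```

For a 3-minute song (180 seconds), factors of 10 and 15 give 18 and 12 seconds respectively, which matches the CPU row of the table.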

HTDemucs Model Download for Source Code Users: If you're building from source code (not using the NuGet package), you must download the HTDemucs model file from Hugging Face due to GitHub's file size restrictions. The NuGet package includes the model automatically.

Multi-Model Averaging

The MultiModelAudioSeparator class improves separation quality by running multiple ONNX models on the same input and averaging their outputs. Each model independently processes the original audio, and the resulting vocals and instrumentals are averaged across all models, which is intended to yield cleaner separation with fewer artifacts than a single model alone.

How It Works: Every model receives the same original audio. After processing, the vocals from all models are averaged together and the instrumentals from all models are averaged together. Models can output either vocals or instrumentals; the complementary stem is always derived by subtracting the model output from the original mix.
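
The averaging described above can be sketched on plain sample buffers. This is an illustration under stated assumptions, not the library's implementation: `StemKind` and `AveragingSketch` are hypothetical names, and the buffers are treated as single-channel float arrays of equal length.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mirror of ModelOutputType, used only in this sketch.
public enum StemKind { Vocals, Instrumental }

public static class AveragingSketch
{
    // Each entry is one model's direct output plus the kind of stem it produced.
    // The complementary stem is derived as (mix - output), then both stems are
    // averaged over all models.
    public static (float[] Vocals, float[] Instrumental) Average(
        float[] mix, IReadOnlyList<(float[] Output, StemKind Kind)> modelOutputs)
    {
        if (modelOutputs.Count == 0)
            throw new ArgumentException("At least one model output is required.");

        var vocals = new float[mix.Length];
        var instrumental = new float[mix.Length];

        foreach (var (output, kind) in modelOutputs)
        {
            if (output.Length != mix.Length)
                throw new ArgumentException("Model output length must match the mix.");

            for (int i = 0; i < mix.Length; i++)
            {
                float complement = mix[i] - output[i];
                vocals[i] += kind == StemKind.Vocals ? output[i] : complement;
                instrumental[i] += kind == StemKind.Instrumental ? output[i] : complement;
            }
        }

        float scale = 1f / modelOutputs.Count;
        for (int i = 0; i < mix.Length; i++)
        {
            vocals[i] *= scale;
            instrumental[i] *= scale;
        }
        return (vocals, instrumental);
    }
}
```

Because each model's two stems sum to the original mix, the averaged vocals and averaged instrumental also sum to the mix in this sketch.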

Key Features

Parallel Averaging

All models process the original audio independently (one after another for each chunk), then vocals and instrumentals are averaged: (V₁+V₂+…+Vₙ)/n and (I₁+I₂+…+Iₙ)/n.

Auto OutputType Detection

The system automatically detects whether a model outputs vocals or instrumentals by inspecting ONNX metadata, model name, or file path. Explicit configuration is also supported.

Intermediate Results

Optionally save every model's individual output to disk for debugging, comparison, or further post-processing.

Mixed Model Types

Combine vocal-focused and instrumental-focused models in the same pipeline — each contributes its strengths, and averaging smooths out the differences.

MultiModelAudioSeparator Class

Namespace

using OwnaudioNET.Features.Vocalremover;

Public API Methods

Method	Return	Description
`Initialize()`	void	Loads and initializes all ONNX model sessions, auto-detects dimensions and output types
`Separate(string)`	MultiModelSeparationResult	Processes the audio file through all models and returns the averaged separation result
`Dispose()`	void	Releases all ONNX sessions and managed resources

Events

Event	Type	Description
`ProgressChanged`	EventHandler<MultiModelSeparationProgress>	Raised on every chunk, with per-model index, name, chunk count, and overall percentage
`ProcessingCompleted`	EventHandler<MultiModelSeparationResult>	Raised when all models have finished and the averaged result is ready

Data Classes

MultiModelSeparationOptions

MultiModelSeparationOptions Class

public class MultiModelSeparationOptions
{
    // List of models to include in the averaging pipeline
    public List<MultiModelInfo> Models { get; set; } = new();

    // Output directory for final and intermediate files
    public string OutputDirectory { get; set; } = "separated_multimodel";

    // Enable GPU acceleration (CUDA on Windows/Linux, CoreML on macOS)
    public bool EnableGPU { get; set; } = true;

    // Overlap margin in samples between chunks (default: 44100 = 1 s)
    public int Margin { get; set; } = 44100;

    // Chunk size in seconds (0 = process entire file at once)
    public int ChunkSizeSeconds { get; set; } = 15;

    // Save individual model outputs to disk (useful for debugging)
    public bool SaveAllIntermediateResults { get; set; } = false;
}

MultiModelInfo

MultiModelInfo Class

public class MultiModelInfo
{
    // Human-readable name shown in progress events and output filenames
    public string Name { get; set; } = "Model";

    // ONNX model file path (leave null/empty to use an embedded InternalModel)
    public string? ModelPath { get; set; }

    // Embedded model selection (Default, Best, Karaoke, …)
    public InternalModel Model { get; set; } = InternalModel.None;

    // FFT size for STFT (0 = auto-detect from ONNX metadata)
    public int NFft { get; set; } = 6144;

    // Time dimension as power of 2 (2^DimT frames)
    public int DimT { get; set; } = 8;

    // Frequency dimension
    public int DimF { get; set; } = 2048;

    // Disable noise reduction for this specific model
    public bool DisableNoiseReduction { get; set; } = false;

    // Save this model's individual output (independent of SaveAllIntermediateResults)
    public bool SaveIntermediateOutput { get; set; } = false;

    // Explicit output type — leave null for auto-detection
    public ModelOutputType? OutputType { get; set; } = null;
}

ModelOutputType Enum

public enum ModelOutputType
{
    // Model directly outputs the instrumental track.
    // Vocals = Original − Instrumental
    Instrumental,

    // Model directly outputs the vocal track.
    // Instrumental = Original − Vocals
    Vocals
}

MultiModelSeparationProgress

MultiModelSeparationProgress Class

public class MultiModelSeparationProgress
{
    public string CurrentFile { get; set; }        // File being processed
    public double OverallProgress { get; set; }    // 0–100 %
    public string Status { get; set; }             // Human-readable step description
    public int CurrentModelIndex { get; set; }     // 1-based model index
    public int TotalModels { get; set; }           // Total number of models
    public string CurrentModelName { get; set; }   // Name of the active model
    public int ProcessedChunks { get; set; }       // Chunks done for current model
    public int TotalChunks { get; set; }           // Total chunks for current model
}

MultiModelSeparationResult

MultiModelSeparationResult Class

public class MultiModelSeparationResult
{
    public string OutputPath { get; set; }         // Same as InstrumentalPath
    public string VocalsPath { get; set; }         // Averaged vocals WAV file
    public string InstrumentalPath { get; set; }   // Averaged instrumental WAV file
    public Dictionary<string, string> IntermediatePaths { get; set; }  // Per-model outputs
    public TimeSpan ProcessingTime { get; set; }   // Wall-clock processing duration
    public int ModelsProcessed { get; set; }       // Number of models that contributed
}

Usage Examples

Simple 2-Model Averaging

Using MultiModelExtensions Helper

using OwnaudioNET.Features.Vocalremover;

// Convenience factory: creates a 2-model averaging pipeline
using var separator = MultiModelExtensions.CreateSimplePipeline(
    model1: InternalModel.Best,
    model2: InternalModel.Karaoke,
    outputDirectory: "output"
);

separator.ProgressChanged += (s, p) =>
    Console.WriteLine($"[Model {p.CurrentModelIndex}/{p.TotalModels}] {p.OverallProgress:F1}%");

separator.Initialize();
var result = separator.Separate("song.mp3");

Console.WriteLine($"Vocals:       {result.VocalsPath}");
Console.WriteLine($"Instrumental: {result.InstrumentalPath}");
Console.WriteLine($"Time:         {result.ProcessingTime}");

Triple-Model Averaging

Using MultiModelExtensions.CreateTriplePipeline

using var separator = MultiModelExtensions.CreateTriplePipeline(
    model1: InternalModel.Best,
    model2: InternalModel.Default,
    model3: InternalModel.Karaoke,
    outputDirectory: "output_triple"
);

separator.Initialize();
var result = separator.Separate("song.flac");

// Final averaged outputs
Console.WriteLine($"Vocals:       {result.VocalsPath}");
Console.WriteLine($"Instrumental: {result.InstrumentalPath}");

// Intermediate per-model outputs (saved because the helper enables it)
foreach (var kv in result.IntermediatePaths)
    Console.WriteLine($"  {kv.Key}: {kv.Value}");

Full Control with Custom Options

Manual MultiModelSeparationOptions

var options = new MultiModelSeparationOptions
{
    Models = new List<MultiModelInfo>
    {
        new MultiModelInfo
        {
            Name = "Step1_BestQuality",
            Model = InternalModel.Best,
            NFft = 6144, DimT = 8, DimF = 2048,
            DisableNoiseReduction = false,
            SaveIntermediateOutput = true
        },
        new MultiModelInfo
        {
            Name = "Step2_Karaoke",
            Model = InternalModel.Karaoke,
            NFft = 6144, DimT = 8, DimF = 2048,
            DisableNoiseReduction = true
        }
    },
    OutputDirectory = "output_custom",
    EnableGPU = true,
    ChunkSizeSeconds = 15,
    Margin = 44100,
    SaveAllIntermediateResults = true
};

using var separator = new MultiModelAudioSeparator(options);

separator.ProgressChanged += (s, p) =>
{
    Console.Write($"\r[{p.CurrentModelName}] chunk {p.ProcessedChunks}/{p.TotalChunks} ({p.OverallProgress:F1}%)");
};

separator.Initialize();
var result = separator.Separate("input.wav");

Mixed OutputType — Vocal + Instrumental Models

Combining Models with Different Output Stems

var options = new MultiModelSeparationOptions
{
    Models = new List<MultiModelInfo>
    {
        new MultiModelInfo
        {
            Name = "VocalModel",
            ModelPath = @"models/vocal_model.onnx",
            OutputType = ModelOutputType.Vocals       // outputs vocal track directly
        },
        new MultiModelInfo
        {
            Name = "InstrumentalModel",
            ModelPath = @"models/instrumental_model.onnx",
            OutputType = ModelOutputType.Instrumental // outputs instrumental track directly
        }
    },
    OutputDirectory = "output_mixed",
    EnableGPU = true
};

// The averaging pipeline:
// - Vocal complement from model 2: original − instrumental
// - Instrumental complement from model 1: original − vocals
// - Final vocals:       (V₁ + V₂) / 2
// - Final instrumental: (I₁ + I₂) / 2

using var separator = new MultiModelAudioSeparator(options);
separator.Initialize();
var result = separator.Separate("song.mp3");

Custom ONNX Model Files with Auto-Detection

External ONNX Models — OutputType Auto-Detected

// OutputType is inferred from model filename:
//   "Voc_FT.onnx"   → contains "Voc"  → Vocals
//   "Inst_HQ.onnx"  → contains "Inst" → Instrumental
var options = new MultiModelSeparationOptions
{
    Models = new List<MultiModelInfo>
    {
        new MultiModelInfo { Name = "Voc_FT",  ModelPath = @"models/Voc_FT.onnx" },
        new MultiModelInfo { Name = "Inst_HQ", ModelPath = @"models/Inst_HQ.onnx" }
    },
    OutputDirectory = "output_custom_files",
    EnableGPU = true
};

using var separator = new MultiModelAudioSeparator(options);
separator.Initialize();
var result = separator.Separate("song.wav");

Auto-Detection of OutputType

When MultiModelInfo.OutputType is left as null, the system uses a three-strategy detection approach:

ONNX output metadata: Inspects the model's output node names for keywords such as vocal, voice, instrumental, karaoke, etc.
Model name / file path: Scans MultiModelInfo.Name and ModelPath for the same keywords.
InternalModel enum name: Falls back to the enum value name.

If none of the strategies yields a conclusive result, Instrumental is used as the safe default.
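
The fallback order can be sketched as a keyword scan over candidate strings in priority order. This is a hedged illustration only: `DetectedStem`, `OutputTypeGuess`, the keyword lists, and the rule that an ambiguous candidate is skipped are all assumptions, not the library's actual detection tables.

```csharp
using System;

// Hypothetical mirror of ModelOutputType, used only in this sketch.
public enum DetectedStem { Instrumental, Vocals }

public static class OutputTypeGuess
{
    // Illustrative keyword lists (assumptions).
    private static readonly string[] VocalKeywords = { "vocal", "voice", "voc" };
    private static readonly string[] InstrumentalKeywords = { "instrumental", "inst" };

    // Checks the candidates in the order given (for example: ONNX output name,
    // model name, file path, enum name) and returns the first conclusive match.
    // Falls back to Instrumental when nothing matches.
    public static DetectedStem Guess(params string[] candidates)
    {
        foreach (string candidate in candidates)
        {
            if (string.IsNullOrEmpty(candidate)) continue;

            bool vocal = ContainsAny(candidate, VocalKeywords);
            bool instrumental = ContainsAny(candidate, InstrumentalKeywords);

            if (vocal && !instrumental) return DetectedStem.Vocals;
            if (instrumental && !vocal) return DetectedStem.Instrumental;
            // Ambiguous or no match: try the next candidate.
        }
        return DetectedStem.Instrumental;
    }

    private static bool ContainsAny(string text, string[] keywords)
    {
        foreach (string keyword in keywords)
        {
            if (text.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0) return true;
        }
        return false;
    }
}
```

Setting OutputType explicitly remains the safer choice whenever a model name could match both keyword groups.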

Advanced Configuration

Processing Parameters

Fine-tune the separation process with advanced STFT and chunking parameters:

Parameter	Default	Description
`ChunkSizeSeconds`	15	Length of each processing chunk in seconds. Smaller values use less memory but increase processing overhead.
`Margin`	44100	Overlap margin in samples between chunks. Prevents artifacts at chunk boundaries. Should be at least 0.5 seconds (22050 samples at 44.1 kHz).
`NFft`	6144	FFT size for spectral analysis. Higher values provide better frequency resolution but increase computation time.
`DimF`	2048	Frequency dimension for model input. Auto-detected from model metadata if not specified.
`DimT`	8	Time dimension as power of 2 (2^8 = 256 time frames). Auto-detected from model metadata if not specified.
`DisableNoiseReduction`	false	When false, applies noise reduction using a phase-inversion technique for cleaner results.
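
To make the interaction of ChunkSizeSeconds and Margin concrete, the sketch below plans chunk ranges for a 44.1 kHz signal. It assumes each chunk is extended by Margin samples on both sides and clamped to the signal bounds; the library's actual segmentation may differ, and `ChunkPlanner` is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkPlanner
{
    public const int SampleRate = 44100;

    // Returns (Start, End) sample ranges including margins; End is exclusive.
    public static List<(int Start, int End)> Plan(int totalSamples, int chunkSizeSeconds, int margin)
    {
        if (totalSamples <= 0) throw new ArgumentOutOfRangeException(nameof(totalSamples));
        if (margin < 0) throw new ArgumentOutOfRangeException(nameof(margin));

        var ranges = new List<(int Start, int End)>();
        int chunkSamples = chunkSizeSeconds * SampleRate;

        // A chunk size of 0 means "process the entire file at once".
        if (chunkSizeSeconds <= 0 || chunkSamples >= totalSamples)
        {
            ranges.Add((0, totalSamples));
            return ranges;
        }

        for (int coreStart = 0; coreStart < totalSamples; coreStart += chunkSamples)
        {
            int coreEnd = Math.Min(coreStart + chunkSamples, totalSamples);
            int start = Math.Max(0, coreStart - margin);       // extend left by the margin
            int end = Math.Min(totalSamples, coreEnd + margin); // extend right by the margin
            ranges.Add((start, end));
        }
        return ranges;
    }
}
```

Under these assumptions, a 3-minute file with the defaults (15-second chunks, 44100-sample margin) yields 12 chunks, and each interior chunk spans 17 seconds of audio including its margins.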

Hardware Acceleration

The service automatically detects and utilizes available hardware acceleration:

GPU Acceleration

// Automatic GPU detection during initialization
service.Initialize();

// Console output will show:
// "CUDA execution provider enabled." (if GPU available)
// OR
// "Using CPU execution provider." (fallback)

// Processing speed comparison:
// CPU: ~2-5x real-time (depends on CPU cores)
// GPU (CUDA): ~10-30x real-time (depends on GPU model)

Memory Management

For large files or systems with limited memory, adjust chunk size and margin:

Memory Optimization

// Low memory configuration (suitable for 4GB RAM)
var lowMemOptions = new SimpleSeparationOptions
{
    Model = InternalModel.Default,
    ChunkSizeSeconds = 10,  // Smaller chunks
    Margin = 22050         // 0.5 second margin
};

// High quality configuration (requires 8GB+ RAM)
var highQualityOptions = new SimpleSeparationOptions
{
    Model = InternalModel.Best,
    ChunkSizeSeconds = 30,  // Larger chunks for better quality
    Margin = 88200          // 2 second margin for smoother transitions
};

Progress Tracking

Monitoring Progress

Track separation progress in real-time using event handlers:

Progress Event Handling

using var service = new SimpleAudioSeparationService(options);

// Subscribe to progress events
service.ProgressChanged += (sender, progress) =>
{
    Console.WriteLine($"[{progress.OverallProgress:F1}%] {progress.Status}");

    if (progress.TotalChunks > 0)
    {
        Console.WriteLine($"Chunk {progress.ProcessedChunks}/{progress.TotalChunks}");
    }
};

// Subscribe to completion event
service.ProcessingCompleted += (sender, result) =>
{
    Console.WriteLine($"\nSeparation completed in {result.ProcessingTime}");
    Console.WriteLine($"Vocals saved to: {result.VocalsPath}");
    Console.WriteLine($"Instrumental saved to: {result.InstrumentalPath}");
};

service.Initialize();
service.Separate("song.mp3");

Progress Stages

Loading audio file (0%): Decoding input audio to 44.1kHz stereo format
Processing audio separation (10%): Creating and processing audio chunks
Processing chunks (20-80%): Running neural network inference on each chunk
Calculating results (90%): Reconstructing full-length separated tracks
Completed (100%): Saving output files to disk
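
The 20-80% band for chunk processing can be pictured as a linear mapping from chunk counts to the overall percentage. The sketch below assumes such a linear mapping; the exact values reported by the service may differ, and `ProgressMapping` is a hypothetical helper.

```csharp
using System;

public static class ProgressMapping
{
    // Maps chunk progress onto the 20-80 % band of the overall progress value.
    public static double ChunkStageProgress(int processedChunks, int totalChunks)
    {
        if (totalChunks <= 0) return 20.0; // no chunk information yet

        double fraction = (double)processedChunks / totalChunks;
        fraction = Math.Max(0.0, Math.Min(1.0, fraction)); // keep within [0, 1]
        return 20.0 + fraction * 60.0;
    }
}
```

A handler that drives a progress bar can rely on OverallProgress directly; the mapping only explains why the value moves in steps while chunks are being processed.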

Factory Methods

AudioSeparationExtensions

Convenient factory methods for quick service creation:

Factory Methods

using OwnaudioNET.Features.Vocalremover;

// Create service with internal model
var service1 = AudioSeparationExtensions.CreateDefaultService(InternalModel.Best);

// Create service with custom model file
var service2 = AudioSeparationExtensions.CreateDefaultService("path/to/model.onnx");

// Create service with custom output directory
var service3 = AudioSeparationExtensions.CreatetService(
    InternalModel.Karaoke,
    @"C:\Output\Karaoke"
);

// Create service with custom model and output
var service4 = AudioSeparationExtensions.CreatetService(
    "custom_model.onnx",
    @"C:\Output\Custom"
);

SimpleSeparator Factory

Simplified factory for quick one-line initialization:

SimpleSeparator Usage

// Quick initialization and separation
var (service, _, _) = SimpleSeparator.Separator(
    InternalModel.Default,
    "output_folder"
);

var result = service.Separate("input.wav");
service.Dispose();

Helper Methods

Audio Validation & Estimation

// Validate audio file format
bool isValid = AudioSeparationExtensions.IsValidAudioFile("song.mp3");
// Supports: .wav, .mp3, .flac

// Estimate processing time
TimeSpan estimate = AudioSeparationExtensions.EstimateProcessingTime("song.wav");
Console.WriteLine($"Estimated processing time: {estimate}");

Usage Examples

Basic Vocal Removal

Simple Vocal Removal

using OwnaudioNET.Features.Vocalremover;

// Create service with default settings
var options = new SimpleSeparationOptions
{
    Model = InternalModel.Default,
    OutputDirectory = "separated"
};

using var service = new SimpleAudioSeparationService(options);
service.Initialize();

// Separate audio file
var result = service.Separate(@"C:\Music\song.mp3");

Console.WriteLine($"Vocals: {result.VocalsPath}");
Console.WriteLine($"Instrumental: {result.InstrumentalPath}");
Console.WriteLine($"Time: {result.ProcessingTime.TotalSeconds:F1}s");

High-Quality Separation with Progress

Best Quality with Progress Tracking

var options = new SimpleSeparationOptions
{
    Model = InternalModel.Best,
    OutputDirectory = "output_best",
    ChunkSizeSeconds = 20,
    DisableNoiseReduction = false  // Enable noise reduction
};

using var service = new SimpleAudioSeparationService(options);

// Progress tracking
service.ProgressChanged += (s, p) =>
{
    Console.Write($"\r[{p.OverallProgress:F1}%] {p.Status}");
    if (p.TotalChunks > 0)
        Console.Write($" - Chunk {p.ProcessedChunks}/{p.TotalChunks}");
};

service.ProcessingCompleted += (s, r) =>
{
    Console.WriteLine($"\n\nCompleted in {r.ProcessingTime}");
    Console.WriteLine($"Output files:\n  {r.VocalsPath}\n  {r.InstrumentalPath}");
};

service.Initialize();
service.Separate("input_song.flac");

Karaoke Track Creation

Creating Karaoke Tracks

// Use Karaoke model to preserve background vocals
var options = new SimpleSeparationOptions
{
    Model = InternalModel.Karaoke,
    OutputDirectory = "karaoke_tracks"
};

using var service = new SimpleAudioSeparationService(options);
service.Initialize();

// Process multiple songs
string[] songs = Directory.GetFiles(@"C:\Music\Album", "*.mp3");

foreach (var song in songs)
{
    Console.WriteLine($"\nProcessing: {Path.GetFileName(song)}");
    var result = service.Separate(song);
    Console.WriteLine($"Karaoke track created: {result.InstrumentalPath}");
}

Batch Processing with Factory Method

Batch Vocal Removal

// Create service using factory
var service = AudioSeparationExtensions.CreatetService(
    InternalModel.Default,
    @"C:\Output\Vocals_Removed"
);

service.Initialize();

// Process all WAV files in directory
var files = Directory.GetFiles(@"C:\Music", "*.wav");
int count = 0;

foreach (var file in files)
{
    try
    {
        Console.WriteLine($"\n[{++count}/{files.Length}] {Path.GetFileName(file)}");
        var result = service.Separate(file);
        Console.WriteLine($"✓ Completed in {result.ProcessingTime.TotalSeconds:F1}s");
    }
    catch (Exception ex)
    {
        Console.WriteLine($"✗ Error: {ex.Message}");
    }
}

service.Dispose();

Custom Model Configuration

Using Custom ONNX Models

// Load custom trained model
var options = new SimpleSeparationOptions
{
    Model = InternalModel.None,
    ModelPath = @"C:\Models\my_custom_separator.onnx",
    OutputDirectory = "custom_output",

    // Adjust STFT parameters if needed for custom model
    NFft = 4096,
    DimF = 1024,
    DimT = 9  // 2^9 = 512 time frames
};

using var service = new SimpleAudioSeparationService(options);
service.Initialize();

// Model parameters are auto-detected from ONNX metadata
var result = service.Separate("test.wav");

Performance Note: Audio separation is computationally intensive. Processing time varies significantly based on:

Model choice: Best model is 2-3x slower than Default
Hardware: GPU acceleration provides 5-15x speedup over CPU
File length: Processing time scales linearly with audio duration
Chunk size: Larger chunks are more efficient but use more memory

Typical processing time on modern hardware: 2-10x real-time on CPU, 0.2-1x real-time on GPU.

Technical Features: The Vocal Remover API includes advanced audio processing techniques:

STFT with Hanning window for accurate frequency-domain representation
Reflection padding to prevent boundary artifacts
Hermitian symmetry for proper inverse FFT reconstruction
Overlap-add synthesis with automatic windowing compensation
Phase inversion noise reduction for cleaner separation results
Automatic normalization to prevent clipping in output files
44.1kHz stereo processing for optimal quality
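
As an illustration of the normalization step, the sketch below applies peak normalization only when a buffer would clip and then converts samples to 16-bit PCM. Peak-based scaling is an assumption about how the normalization might work, not a description of the library's code; `OutputScaling` is a hypothetical helper.

```csharp
using System;

public static class OutputScaling
{
    // Scales the buffer in place so that the largest absolute sample does not
    // exceed 'ceiling'. Buffers already within range are left untouched.
    // Returns the gain that was applied.
    public static float NormalizeIfClipping(float[] samples, float ceiling = 1.0f)
    {
        float peak = 0f;
        foreach (float s in samples) peak = Math.Max(peak, Math.Abs(s));

        if (peak <= ceiling) return 1f;

        float gain = ceiling / peak;
        for (int i = 0; i < samples.Length; i++) samples[i] *= gain;
        return gain;
    }

    // Converts a float sample in [-1, 1] to 16-bit PCM, clamping as a last resort.
    public static short ToPcm16(float sample)
    {
        float clamped = Math.Max(-1f, Math.Min(1f, sample));
        return (short)Math.Round(clamped * short.MaxValue);
    }
}
```

Scaling the whole buffer by one gain preserves the relative levels within a track, whereas clamping alone would distort the loudest passages.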

Supported Audio Formats: The Vocal Remover supports common audio formats through the Ownaudio decoder:

WAV (uncompressed PCM)
MP3 (MPEG Audio Layer 3)
FLAC (Free Lossless Audio Codec)

Output files are always saved as 16-bit WAV at 44.1kHz stereo.
