streamtotext.audio module¶
Audio source and audio processing components.
The two main base classes are AudioSource, which provides audio, and
AudioSourceProcessor, which acts as a pipeline processor on another
AudioSource.
-
class
streamtotext.audio.AudioBlock¶ Bases:
object
An iterator over AudioChunk.
Blocks are used to delineate continuous chunks of audio. For example, when using the SquelchedSource audio source, a consumer often wants to know at what points the squelch was triggered on and off.
-
end()¶
-
ended¶
-
-
class
streamtotext.audio.AudioChunk(start_time, audio, width, freq)¶ Bases:
tuple
A sequence of audio samples.
This is the low-level structure used for passing audio. Typically these are obtained by iterating over an AudioBlock.
To keep memory use minimal, this object is implemented as a namedtuple.
Parameters: - start_time (int) – Unix timestamp of the first sample.
- audio (bytes) – Bytes array of audio samples.
- width (int) – Number of bytes per sample.
- freq (int) – Sampling frequency.
-
audio¶ Alias for field number 1
-
freq¶ Alias for field number 3
-
start_time¶ Alias for field number 0
-
width¶ Alias for field number 2
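The field layout above can be illustrated with a stand-in namedtuple. This is a minimal sketch, not the library's definition: it only mirrors the documented field order (start_time, audio, width, freq), and the sample values are made up for illustration.

```python
from collections import namedtuple

# Stand-in mirroring the documented field order; not imported from streamtotext.
AudioChunk = namedtuple("AudioChunk", ["start_time", "audio", "width", "freq"])

# Four 16-bit samples (2 bytes each) of silence at 16 kHz; values are illustrative.
chunk = AudioChunk(start_time=0, audio=b"\x00\x00" * 4, width=2, freq=16000)

# Fields are reachable by name or by position, matching the aliases listed above.
assert chunk.audio is chunk[1]
assert chunk.freq == chunk[3]

# Being a tuple, a chunk is immutable; _replace returns a modified copy.
later = chunk._replace(start_time=1)
```

Because a namedtuple stores no per-instance dictionary, each chunk carries little overhead beyond its audio bytes.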
-
class
streamtotext.audio.AudioPlayer(source, width, channels, freq)¶ Bases:
object
Play audio from an audio source.
This is not generally useful for transcription, but can be very useful when developing AudioSource or AudioSourceProcessor classes.
Parameters: - source (AudioSource) – Source to play.
- width (int) – Bytes per sample.
- channels (int) – Number of channels in output device.
- freq (int) – Sampling frequency of output device.
-
play()¶ Play audio from source.
This method will block until the source runs out of audio.
-
class
streamtotext.audio.AudioSource¶ Bases:
object
Base class for providing audio.
All classes which provide audio in some form implement this class. Audio is obtained by first entering the listen() context manager and then iterating over the AudioSource to obtain AudioBlock objects.
-
listen()¶ Listen to the AudioSource.
Returns: Async context manager which starts and stops the AudioSource.
-
start()¶ Start the audio source.
This is where initialization / opening of audio devices should happen.
-
stop()¶ Stop the audio source.
This is where deinitialization / closing of audio devices should happen.
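The start/stop lifecycle wrapped by listen() can be sketched with a stand-in class. FakeSource below is hypothetical and is not part of streamtotext; in particular, treating start() and stop() as coroutines is an assumption, since the reference above only states that listen() returns an async context manager.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeSource:
    """Hypothetical stand-in for an AudioSource subclass; not library code."""

    def __init__(self):
        self.events = []

    async def start(self):
        # A real source would open its audio device here.
        self.events.append("start")

    async def stop(self):
        # A real source would close its audio device here.
        self.events.append("stop")

    @asynccontextmanager
    async def listen(self):
        # Start on entry and always stop on exit, even if the body raises.
        await self.start()
        try:
            yield self
        finally:
            await self.stop()


async def main():
    source = FakeSource()
    async with source.listen():
        source.events.append("listening")
    return source.events


events = asyncio.run(main())
```

Putting stop() in a finally clause means the device is released even when the consumer raises while listening.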
-
-
class
streamtotext.audio.AudioSourceProcessor(source)¶ Bases:
streamtotext.audio.AudioSource
Base class for a pipeline processor of an AudioSource.
Parameters: source (AudioSource) – Input source
-
start()¶ Start the input audio source.
This is intended to be called from the base class, not directly.
-
stop()¶ Stop the input audio source.
This is intended to be called from the base class, not directly.
-
-
class
streamtotext.audio.EvenChunkIterator(iterator, chunk_size)¶ Bases:
object
Iterate over chunks from an audio source in even-sized increments.
Parameters: - iterator (Iterator) – Iterator over audio chunks.
- chunk_size (int) – Number of samples in resulting chunks
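The idea of re-chunking into even-sized increments can be sketched as a buffering generator. The helper even_chunks is hypothetical and works on raw byte strings rather than AudioChunk objects; its width parameter and the choice to drop a trailing partial chunk are assumptions, not documented behaviour of EvenChunkIterator.

```python
def even_chunks(chunks, chunk_size, width=2):
    """Yield byte strings holding exactly chunk_size samples each.

    Hypothetical helper, not the library's EvenChunkIterator. Trailing
    samples that do not fill a whole chunk are dropped in this sketch.
    """
    size_bytes = chunk_size * width
    buffer = b""
    for audio in chunks:
        buffer += audio
        # Emit as many full-sized chunks as the buffer currently holds.
        while len(buffer) >= size_bytes:
            yield buffer[:size_bytes]
            buffer = buffer[size_bytes:]


# Uneven input: 3, 1 and 4 samples of 2 bytes each.
uneven = [b"ab" * 3, b"cd" * 1, b"ef" * 4]
result = list(even_chunks(uneven, chunk_size=2))
```

Here the eight input samples come back as four chunks of two samples each, regardless of how they were grouped on input.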
-
class
streamtotext.audio.Microphone(audio_format=None, channels=1, rate=16000, device_ndx=0)¶ Bases:
streamtotext.audio.AudioSource
Use a local microphone as an audio source.
Parameters: - audio_format – Sample format, default paInt16
- channels (int) – Number of channels in microphone.
- rate (int) – Sample frequency
- device_ndx (int) – PyAudio device index
-
start()¶
-
stop()¶
-
exception
streamtotext.audio.NoDefaultInputDeviceError¶ Bases:
Exception
-
exception
streamtotext.audio.NoMoreChunksError¶ Bases:
Exception
-
class
streamtotext.audio.QueueAudioBlock(queue=None)¶ Bases:
streamtotext.audio.AudioBlock
-
add_chunk(chunk)¶
-
-
class
streamtotext.audio.RateConvert(source, n_channels, out_rate)¶
-
class
streamtotext.audio.SingleBlockAudioSource¶
-
class
streamtotext.audio.SquelchedBlock(source, squelch_level)¶
-
class
streamtotext.audio.SquelchedSource(source, sample_size=1600, squelch_level=None, prefix_samples=4)¶ Bases:
streamtotext.audio.AudioSourceProcessor
Filter out samples below a volume level from an audio source.
This is useful to prevent constant transcription attempts on background noise, and also to create a ‘trigger window’ in which transcription attempts are made.
A sliding window of prefix_samples chunks is inspected. When the RMS of prefix_samples * sample_size samples surpasses squelch_level, this source begins to emit audio. Once the RMS of the sliding window falls below 80% of the squelch level, this source stops emitting audio.
Parameters: - source (AudioSource) – Input source
- sample_size (int) – Size of each sample to inspect.
- squelch_level (int) – RMS value to trigger squelch
- prefix_samples (int) – Number of samples of sample_size to check
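The open/close behaviour described above is a gate with hysteresis, and can be sketched in plain Python. The helpers rms, squelch and tone are hypothetical and are not the library's implementation; the sketch assumes signed 16-bit samples and treats each input byte string as one sample_size unit of the sliding window.

```python
import math
from array import array
from collections import deque


def rms(audio):
    """Root mean square of signed 16-bit native-endian samples."""
    samples = array("h", audio)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def squelch(chunks, squelch_level, prefix_samples=4, release=0.8):
    """Yield (chunk, is_open) pairs using a sliding-window RMS gate.

    Hypothetical helper illustrating the described hysteresis; it is not
    the library's SquelchedSource.
    """
    window = deque(maxlen=prefix_samples)
    is_open = False
    for audio in chunks:
        window.append(audio)
        level = rms(b"".join(window))
        if not is_open and level > squelch_level:
            # Window got loud enough: start emitting.
            is_open = True
        elif is_open and level < release * squelch_level:
            # Window fell below 80% of the trigger level: stop emitting.
            is_open = False
        yield audio, is_open


def tone(amplitude, n=4):
    # n samples of constant magnitude, so the RMS equals the amplitude.
    return array("h", [amplitude] * n).tobytes()


chunks = [tone(0), tone(1000), tone(1000), tone(0), tone(0), tone(0)]
states = [is_open for _, is_open in
          squelch(chunks, squelch_level=600, prefix_samples=2)]
```

Using a lower release threshold than the trigger threshold keeps the gate from flapping when the level hovers near squelch_level.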
-
static
check_squelch(level, is_triggered, chunks)¶
-
detect_squelch_level(detect_time=10, threshold=0.8)¶
-
start()¶
-
class
streamtotext.audio.WaveSource(wave_path, chunk_frames=None)¶ Bases:
streamtotext.audio.SingleBlockAudioSource
Use a wave file as an audio source.
Parameters: - wave_path (string) – Path to wave file.
- chunk_frames (int) – Chunk size to return from get_chunk
-
start()¶
-
stop()¶
-
streamtotext.audio.chunk_sample_cnt(chunk)¶ Number of samples which occurred in an AudioChunk.
Parameters: chunk – The chunk to examine.
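One plausible way to count the samples in a chunk is to divide the length of its audio bytes by the sample width. The sketch below rests on that assumption and on single-channel audio; sample_count and the stand-in AudioChunk are hypothetical and may not match how chunk_sample_cnt is actually implemented.

```python
from collections import namedtuple

# Stand-in mirroring the documented AudioChunk fields; not imported from streamtotext.
AudioChunk = namedtuple("AudioChunk", ["start_time", "audio", "width", "freq"])


def sample_count(chunk):
    """Samples in a chunk, assuming mono audio of chunk.width bytes per sample."""
    return len(chunk.audio) // chunk.width


# Half a second of 16-bit silence at 16 kHz (illustrative values).
chunk = AudioChunk(start_time=0, audio=b"\x00\x00" * 8000, width=2, freq=16000)
count = sample_count(chunk)

# Dividing by the sampling frequency gives the chunk's duration in seconds.
duration = count / chunk.freq
```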
-
streamtotext.audio.merge_chunks(chunks)¶
-
streamtotext.audio.split_chunk(chunk, sample_offset)¶