
Vocal Processing

2017-11-13, 19 pages, doc, 64 KB, 16 views


is_079973


Advanced vocal processing, plus general background: what are VST, DX, VSTi and DXi?
A vocal recording usually has two sources: the accompaniment and the microphone signal. The accompaniment is the same for everyone, so the quality of a production shows mainly in how the voice is treated. This article therefore concentrates on vocal effect processing.
Most people process vocals by trial and error, adjusting settings tentatively until the tone sounds best. The shortcomings of this approach are obvious:
(1) Finding an ideal setting takes many guesses and attempts, so it takes a long time.
(2) A good setting is often hit upon by chance, which does not help in working out general rules and is hard to reproduce later.
(3) Different devices have different fixed and adjustable parameters, so experience gained on one device usually does not transfer to another.
Current effects equipment offers few technical means of changing the tone of audio. Only three basic methods are in common use: frequency equalization, delay feedback, and limiting distortion. Different parameter combinations of these methods nevertheless produce very different sounds. An effects processor can expose many parameters, especially for delay feedback; a simulated reverb alone can have dozens. Most people find such specialized parameters hard to understand and do not know where to begin. Most effects units therefore expose only one or two adjustable parameters with a fairly narrow range, and such simple units can be adjusted by trial and error without much trouble.
For more precise processing, for example in a multitrack recording system, more specialized effects equipment is needed to achieve a more refined result.
Frequency equalization
Clearly, the more bands an equalizer has, the finer the processing can be. Apart from graphic equalizers, the EQ section of a typical mixer has only three or four bands, which cannot meet the needs of accurate source processing. To be flexible enough to equalize the voice in any way required, we suggest a four-band equalizer in which gain, frequency, and bandwidth are all adjustable. Most equalizers have only one adjustable parameter, the gain. This does not mean the other two parameters do not exist; they are simply fixed. Making them adjustable is not difficult, but it raises the cost of the equipment and complicates adjustment, so fully parametric equalization with adjustable gain, frequency, and bandwidth is found only on high-end equipment. With all three parameters adjustable, it is almost impossible to find an ideal tone by trial and error. We therefore have to study the physical characteristics of audio signals, their technical parameters, and the relationship between the two.
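To make the three parameters concrete, here is a minimal sketch of one parametric band written as a digital peaking filter. It is not taken from the article: it assumes the widely used "Audio EQ Cookbook" peaking-filter formulas, a 48 kHz sample rate, and the common conversion from a bandwidth in octaves to a Q factor. The function names and the example settings are illustrative, and bandwidth conventions differ between equalizers.

```python
import cmath
import math


def octaves_to_q(bandwidth_octaves):
    """Convert a bandwidth in octaves to the equivalent Q factor."""
    ratio = 2.0 ** bandwidth_octaves
    return math.sqrt(ratio) / (ratio - 1.0)


def peaking_band(sample_rate, center_hz, gain_db, bandwidth_octaves):
    """Return normalized biquad coefficients (b0, b1, b2, a1, a2) for one band.

    Assumption: cookbook-style peaking filter, not any specific hardware EQ.
    """
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * octaves_to_q(bandwidth_octaves))
    a0 = 1.0 + alpha / amp
    b0 = (1.0 + alpha * amp) / a0
    b1 = (-2.0 * math.cos(w0)) / a0
    b2 = (1.0 - alpha * amp) / a0
    a1 = b1  # numerator and denominator share the -2*cos(w0) term
    a2 = (1.0 - alpha / amp) / a0
    return b0, b1, b2, a1, a2


def response_db(coeffs, sample_rate, freq_hz):
    """Evaluate the magnitude response of the band at one frequency, in dB."""
    b0, b1, b2, a1, a2 = coeffs
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate)
    z2 = z1 * z1
    h = (b0 + b1 * z1 + b2 * z2) / (1.0 + a1 * z1 + a2 * z2)
    return 20.0 * math.log10(abs(h))


# Example settings (illustrative): a narrow 1/2-octave cut of 4 dB at 2500 Hz.
band = peaking_band(48000, 2500.0, -4.0, 0.5)
for freq in (100.0, 1250.0, 2500.0, 5000.0):
    print(freq, round(response_db(band, 48000, freq), 2))
```

Printing the response at a few frequencies shows how a narrow bandwidth confines the change to the neighborhood of the center frequency, while a wide bandwidth would spread it over several octaves.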
The spectrum of the voice is rather special. In terms of how it is produced, it has three parts. The first is the tone generated by vocal-cord vibration; this part is the most flexible, and its spectrum changes greatly with pitch and with the sound being articulated. The second is nasal resonance; the shape of the nasal cavity is relatively stable, so the harmonic spectrum it produces changes little. The third is the friction of the airflow between the teeth, the dental (sibilant) sound, which is independent of the tone generated by the vocal cords. Frequency equalization can roughly separate these three parts of the spectrum.
The frequency range for adjusting the nasal sound is below 500 Hz. The center frequency is generally 80 to 150 Hz and the bandwidth is 4 octaves. For example, 100 Hz can be chosen as the center frequency, with the curve making a gentle transition from 100 to 400 Hz; the gain range can be +10 dB to -6 dB. Note that the monitor speakers used for this adjustment should not be small cabinets with weak low-frequency response, otherwise the nasal sound may be boosted too far without anyone noticing.
The spectrum of the vocal tone varies greatly with pitch, so the curve used to adjust it should be very smooth, with a center frequency of 1000 to 3400 Hz and a bandwidth of 6 octaves. This band controls the brightness of the singing voice; a gentle boost increases the brightness. Reducing the brightness is more complicated. An overly bright voice usually has too much energy near 2500 Hz. Here we can use a bandwidth of 1/2 octave and a gain of about -4 dB, then search around 2500 Hz for the frequency that gives the best result.
The sibilant spectrum is distributed above 4 kHz.
Because this band also contains part of the tonal spectrum, sibilance is adjusted in the 6 to 16 kHz range, with a bandwidth of 3 octaves, a center frequency generally at 10 to 12 kHz, and a gain of up to +10 dB. To reduce the loudness of sibilance, use a bandwidth of 1/2 octave and a center frequency of 6800 Hz, with the gain lowered to as little as -10 dB.
As the analysis above shows, when equalizing the voice to bring out a band, use broad equalization curves wherever possible. This keeps the spectra of the three parts (tone, nasal sound, and sibilance) evenly distributed and coherent, so the voice sounds natural and smooth. In theory, the loudness of a voice should stay constant whatever sound it is producing. To treat specific problems without destroying this natural quality, narrow 1/5-octave equalization can be used, in the following situations:
(1) The voice lacks body and sounds thin: attenuate with a 1/5-octave band at 800 Hz, by at most 3 dB.
(2) Retroflex sounds are shrill and "sh" sounds lack clarity: attenuate with a 1/5-octave band at 2500 Hz, by at most 6 dB.
For equalizing sound sources it is best to use an equalizer that can display its curve. On a typical digital equalizer the gain control is labeled "G", the frequency control is labeled "F", and the bandwidth control is labeled "F" or "Q".
Delay feedback
Delay feedback is the most widely used, and also the most complex, method of effect processing. Reverb, chorus, flanging, echo and similar effects are all built on delay feedback.
1. Reverb
The reverb effect is mainly used to improve the blend of sound sources.
The pattern of delayed reflections in natural reverberation is very dense and complex, so the programs that simulate reverb are complex and varied. Common parameters are as follows:
Reverberation time: a digital reverb's programs are elaborate simulations of natural reverberation. Many of their parameters can be adjusted, but the adjusted result is rarely as natural as the original preset, and this is especially true of the reverberation time.
High-frequency roll-off: this parameter simulates the absorption of high frequencies by the air in natural reverberation, to produce a more natural effect. Its range is generally 0.1 to 1.0. Higher values bring the effect closer to natural reverberation; lower values make it clearer.
Diffusion: this parameter adjusts how quickly the density of the reflection pattern grows. Its range is 0 to 10. Higher values give a richer, warmer reverb; lower values give an emptier, more desolate reverb.
Pre-delay: the build-up of natural reverberation lags the direct sound by a certain time, and the pre-delay is provided to simulate this.
Density: this parameter adjusts the density of the reflection pattern. Higher values give a warmer reverb but with obvious coloration; lower values give a deeper reverb with less coloration.
Modulation rate: this is a technical parameter. The reflection pattern of an electronic reverb is sparser than that of natural reverberation, so to make the reverb smooth and continuous the delay times of the reflection pattern need to be modulated.
This technique effectively removes the audibly discrete repeats of the delay pattern and makes the reverb sound softer.
Modulation depth: the depth of the modulation applied by the modulation circuit.
Reverb type: the natural reverberation of different rooms differs, and the difference cannot be captured by one or two parameters. In a digital reverb, different kinds of natural reverberation need different programs. The options are generally small hall (S-Hall), large hall (L-Hall), room (Room), random (Random), reverse (Reverse), plate (Plate), spring (Spring), and so on. The small hall, large hall, and room programs give natural reverberation, while plate and spring simulate early mechanical reverb devices.
Room size: this applies to the natural reverb programs and is self-explanatory.
Room liveness: liveness describes how reverberant a room is, which depends on the sound absorption of its walls. This parameter adjusts that characteristic.
Balance between early reflections and reverberation: the early reflections and the reverberant tail differ in character, and the early reflections are the most variable part of the effect, so digital reverbs generate the two separately. This parameter adjusts the loudness balance between the early reflections and the reverberant tail.
Time between early reflections and reverberation: the delay between the early reflections and the reverberant tail. With a longer time, the front of the reverb is clearer; with a shorter time, the early reflections and the tail overlap and the front of the reverb is muddier.
Besides the parameters above, a reverb has some auxiliary parameters, such as low-pass filtering, high-pass filtering, and the loudness balance between direct and reverberant sound.
2. Delay
Delay is the effect of holding back the source for a period of time and then playing it again. Depending on the delay time, it produces chorus, flanging, echo, and so on. When the delay time is between 3 and 35 ms, the ear cannot perceive the delayed sound as separate, and when it is mixed with the original source, phase interference produces a "comb filter" effect; this is flanging. When the delay time is above 50 ms, the delayed sound is clearly audible, and the result is an echo. Echo processing is generally used to produce simple reverb effects.
Delay, chorus, flanging, and echo have almost the same adjustable parameters:
* Delay time (Dly): the delay time of the main delay circuit.
* Feedback gain (FBGain): the gain of the delay feedback.
* Feedback high-frequency ratio (HiRatio): the high-frequency attenuation in the feedback loop.
* Modulation frequency (Freq): the rate at which the main delay time is modulated.
* Modulation depth (Depth): the depth of that modulation.
* High-frequency gain (HF): high-frequency equalization.
* Pre-delay (IniDly): the initial delay ahead of the main delay circuit.
* Equalizer frequency (EQF): the center frequency of the equalizer used to adjust the tone of the effect.
Because delay effects are complex, anyone who is not an expert in effect processing is advised to use the presets supplied with the equipment, which generally give better results.
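The role of the delay time and the feedback gain can be illustrated with a minimal sketch of a single feedback delay line. It is a simplified model written for this purpose, not the algorithm of any particular effects unit: it has no modulation, no high-frequency damping, and no pre-delay, and the names `feedback_delay` and `delay_character` are hypothetical. The thresholds in `delay_character` are the ones quoted in the text; the text does not classify the range between 35 and 50 ms.

```python
def delay_character(delay_ms):
    """Classify a delay time using the ranges given in the text."""
    if 3.0 <= delay_ms <= 35.0:
        return "flanger"  # comb-filter coloration, no separate repeat heard
    if delay_ms > 50.0:
        return "echo"  # the repeat is heard as a distinct sound
    return "unspecified"


def feedback_delay(samples, sample_rate, delay_ms, fb_gain, mix):
    """Mix a signal with a delayed copy of itself that is fed back into the delay."""
    length = max(1, round(sample_rate * delay_ms / 1000.0))
    line = [0.0] * length  # circular buffer holding the delayed signal
    pos = 0
    out = []
    for x in samples:
        delayed = line[pos]
        line[pos] = x + fb_gain * delayed  # feedback: each repeat is re-delayed
        out.append((1.0 - mix) * x + mix * delayed)
        pos = (pos + 1) % length
    return out


# A single click through a 100 ms delay at a toy sample rate of 1000 Hz.
impulse = [1.0] + [0.0] * 349
echoes = feedback_delay(impulse, 1000, 100.0, 0.5, 0.5)
print([(n, v) for n, v in enumerate(echoes) if v != 0.0])
```

Each repeat arrives one delay time after the previous one and is scaled by the feedback gain, so a higher feedback gain gives a longer train of echoes.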
Excitation
Limiting the amplitude of an audio signal heavily produces a saturation-like effect that raises the perceived loudness of the sound without raising its actual level. Some digital effects units also offer a nonlinear saturation effect, which imitates the saturation of a transistor driven into its nonlinear region by a large signal and gives a "hard" sound. Limiting distortion works by generating extra harmonic components, so newer exciter designs, in order to soften the effect, generate high-order harmonic components from the source to imitate limiting distortion, which creates an excitation effect with less hoarseness. In addition, the original signal is passed through a high-pass filter that emphasizes the high harmonics and is then added to the delayed original signal, which gives a clear attack. This approach produces a less noisy excitation.
Excitation is similar to overload distortion in audio equipment, so over-exciting a source produces an unpleasant, noisy quality. Because early audio equipment had low fidelity, listeners became used to a slightly noisy sound, and clean high-fidelity sound can seem unfamiliar and too weak to them. Apart from a small number of specially trained people, most voices lack strength, so excitation is necessary. There are several ways to excite the voice:
(1) Excite the tonal spectrum of the voice, which is centered at 2500 Hz. This sounds natural and comfortable, and it clearly makes the source more prominent.
(2) Excite the nasal spectrum at 500 Hz.
This kind of excitation effectively strengthens the voice.
(3) Exciting the band near 800 Hz makes the source sound rougher and noisier. This should be used with great care, preferably only for rock.
(4) Do not excite the 3500 to 6800 Hz range of the voice, because it easily makes the source sound unpleasantly harsh.
(5) Sibilance should generally not be excited, because distortion in this band is very easy to notice. If the digital exciter in use has a relatively soft effect, sibilance can be excited lightly to improve its clarity; the processing frequency should then be above 7200 Hz.
Excitation of the singing voice is usually applied conservatively. In practice, the perceived effect of excitation fades as the ear listens for a long time, so adjusting the excitation should not take more than 10 minutes.
For exciting the voice it is best to use a digital effects processor. It usually has the following parameters:
1. Input gain (Gain): sets the input level. Be careful not to overload the device.
2. Tuning frequency (Tuning): selects the frequency to suit the band being processed.
3. Drive level (Drive): sets the depth of excitation. A high drive level gives a noisier effect; a low drive level gives a milder one.
4. Mix (Mix): the loudness ratio between the original signal and the effect signal.
Overall plan for effect processing
For fine processing of vocal sources, use one fully digital mixer, at least three digital effects units, and one digital exciter.
First, on the mixer, use the channel equalizer to adjust the vocal tone and improve its sound quality. Some common examples follow.
(1) The band near 800 Hz can give the voice a dull, tiresome quality. Attenuating this band by up to 15 dB, with a bandwidth of 1/5 octave, improves the overall impression of the voice.
(2) The band near 6800 Hz makes the voice sound shrill and harsh. Attenuating it by up to 10 dB, with a bandwidth of 1/5 octave, reduces the shrillness of the sibilance.
(3) For voices that are too bright and piercing, attenuate by up to 8 dB at 3400 Hz, with a bandwidth of 1/3 octave.
(4) For an overly heavy nasal sound, attenuate appropriately below 500 Hz, with a bandwidth of 3 octaves.
(5) Because of the sensitivity of the human ear, very-high-frequency sibilants need a boost of 6 dB at 12 kHz (bandwidth 2 octaves) to balance their loudness with the music and the vocal tone.
The equalization above is suited to live sound reinforcement. For multitrack recording or broadcast relay, the gain adjustments should be halved.
After equalizing, adjust the exciter. First set the exciter's drive level and mix level to maximum and the tuning frequency to 2500 Hz. If the sound becomes noisy or too hard, reduce the drive level; note that this changes the hardness of the source. If the drive level stays high and only the mix level is lowered, the hardness of the excited sound stays the same, but it is slightly masked by the unprocessed sound. This is more obvious when the excitation is very deep. The unprocessed sound and the excited sound then form two layers, which increases the sense of depth in the voice.
In general, one exciter can handle only one frequency band, and several single-function exciters should not be connected in parallel; they can only be connected in series.
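The interaction between tuning frequency, drive, and mix can be sketched in a few lines. This is a deliberately simple model and not the circuit of any real exciter: it assumes a one-pole high-pass filter at the tuning frequency, a `tanh` soft limiter as the harmonic generator, and a mix control that adds the effect signal to the untouched original. The function name `excite` and all parameter values are illustrative.

```python
import math


def excite(samples, sample_rate, tuning_hz, drive, mix):
    """Add soft-limited harmonics of the band above tuning_hz to the dry signal."""
    rc = 1.0 / (2.0 * math.pi * tuning_hz)
    dt = 1.0 / sample_rate
    coeff = rc / (rc + dt)  # one-pole high-pass coefficient
    out = []
    prev_in = 0.0
    prev_hp = 0.0
    for x in samples:
        hp = coeff * (prev_hp + x - prev_in)  # keep the band to be excited
        prev_in, prev_hp = x, hp
        harmonics = math.tanh(drive * hp)  # soft limiting creates harmonics
        out.append(x + mix * harmonics)  # the dry signal is left unchanged
    return out


# A 10 ms test tone at 2500 Hz, sampled at 48 kHz.
tone = [0.5 * math.sin(2.0 * math.pi * 2500.0 * n / 48000.0) for n in range(480)]
hard = excite(tone, 48000, 2500.0, 8.0, 0.3)  # high drive, moderate mix
soft = excite(tone, 48000, 2500.0, 1.0, 0.3)  # low drive, same mix
print(max(abs(h - t) for h, t in zip(hard, tone)))
print(max(abs(s - t) for s, t in zip(soft, tone)))
```

In this model the drive sets how hard the limiter is pushed, and so how harsh the added harmonics are, while the mix only sets how loud those harmonics are relative to the original. That mirrors the observation above that lowering the mix alone leaves the hardness of the excited layer unchanged.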
If several frequency bands of the source need excitation, we suggest connecting the devices as shown in the figure. The reverb should be a multi-effects unit that includes an excitation function (such as the Yamaha SPX990). The exciters can then handle the 500 Hz, 800 Hz, and 7200 Hz bands, while the excitation function of the reverb unit handles the 2500 Hz band. Once again, the excitation must not be adjusted for too long, otherwise the ear tires and can no longer judge accurately whether the degree of excitation is appropriate.
Finally, adjust the reverb. Reverb here covers two things: basic polishing and strong coloration. Basic polishing mainly uses reverb to help the source blend in, but the listener must not be able to hear an obvious room reverberation. Strong coloration is mainly used to create an atmosphere around the source. There are three cases:
(1) Creating a sense of space. Use a room or hall reverb. Simulating an obvious natural reverberation is a simple and effective approach. Slightly boosting the band near 3500 Hz on the effect channel gives a bright, penetrating sound. There is one drawback: the effect is muddy and sometimes has a muffled, "boxy" sound.
(2) Generating echoes. Delay feedback with a long delay time can simulate the echo of a valley. The delay time is generally matched to the rhythm of the song. To make the effect sound more distant, attenuate the bands below 1600 Hz and above 3800 Hz. Many digital processors provide ready-made programs for valley echo.
(3) Generating a sound background. A reverb that lingers around the voice is very effective at beautifying it, and almost all vocals use reverb.
Provided the sound does not become muddy or "boxy", one might think the more reverb the better. In practice, however, the sound has often already become muddy and obviously boxy while the reverb is still very weak. To produce a sound background without making the sound muddy or boxy, we recommend connecting a delay and a reverb in series. The delay time for this kind of processing is generally 200 to 600 ms and the feedback gain 40% to 60%; the reverb uses a hall program with a reverberation time of 2 to 8 s. The reverb after series processing must be smooth and continuous. If distinct attacks can be heard after processing, make the following adjustments: first, shorten the delay time; second, increase the reverb level; third, adjust the reverberation time.
Strong coloration with reverb should generally be applied on top of basic polishing, so that the strong coloration itself can be applied more lightly.
==================================================================================
VST
VST is the abbreviation of Virtual Studio Technology. It is a software effects technology developed by Steinberg. It mostly takes the form of plug-ins, runs in most of today's professional music software, and provides very high-quality effect processing with low latency on hardware that supports ASIO. To get the best from VST (that is, very low latency), the sound card must support ASIO. VST effects cover almost every effect used in music production. Because VST is an open technology, large companies, small companies, and countless individuals have developed VST effects. Some of them are very successful and practical, and even Hollywood film productions use top-end VST plug-ins.
At present, most commercial recording studios in China use VST effects for late-stage mixing and processing. Music software that can use VST plug-ins is called a VST host. Commonly used hosts include Samplitude (7 and later), Cubase VST32, Cubase SX, Nuendo, WaveLab, FruityLoops, Orion, Project5, and Audition. A VST effect processes audio, so it is loaded on an audio track; MIDI tracks cannot use VST effects.
DX
DX is the abbreviation of DirectX. It is a software effects technology based on Microsoft's DirectX interface. It mostly takes the form of plug-ins, runs in 99% of professional music software on the PC (literally), and provides very high-quality effect processing with low latency on hardware that supports WDM. DX effects cover almost every effect used in music production. Because DirectX is an open technology, large companies, small companies, and countless individuals have developed DX effects. Some of them are very successful and practical, and even Hollywood film productions use top-end DX plug-ins. Music software that can use DX plug-ins is called a DX host. There are more DX hosts than hosts for any other plug-in type (reportedly 99%). Commonly used hosts include Samplitude, Cubase, Sound Forge, WaveLab, SONAR, Cakewalk, FruityLoops, and Orion. A DX effect processes audio, so it is loaded on an audio track; MIDI tracks cannot use DX effects.
VSTi
VSTi is the abbreviation of Virtual Studio Technology Instruments. It is a virtual instrument technology from Steinberg. It mostly takes the form of plug-ins, runs in most of today's professional music software, and provides very high-quality results with low latency on hardware that supports ASIO. To get the best from VSTi (that is, very low latency), the sound card must support ASIO. A VSTi soft synthesizer is different from a VST effect.
It is driven by a MIDI track, and each VSTi plug-in provides many timbres along with rich parameter controls that let you create your own sounds. Different VSTi plug-ins use different synthesis methods; wavetable synthesizers, analog synthesizers, and FM synthesizers can all be implemented as VSTi. Music software that can use VSTi plug-ins is called a VSTi host. Commonly used hosts include Samplitude (7 and later), Cubase VST32, Cubase SX, FruityLoops, Orion, and Project5. A VSTi virtual instrument can be regarded as a software sound source, so it can only be loaded on a MIDI track.
DXi
DXi is the abbreviation of DirectX Instrument. It is a soft synthesizer technology developed independently by Cakewalk on the basis of DirectX. It also mostly takes the form of plug-ins and currently runs only in SONAR (note: Cakewalk does not support DXi; Cakewalk was discontinued at version 9 and replaced by SONAR). It provides very high-quality sound synthesis with low latency on hardware that supports WDM. A DXi soft synthesizer is different from a DX effect. It is driven by a MIDI track, and each DXi plug-in provides many timbres along with rich parameter controls that let you create your own sounds. Different DXi plug-ins use different synthesis methods; wavetable synthesizers, analog synthesizers, and FM synthesizers can all be implemented as DXi.