Application case: crowd-gathering detection technology for self-service banks

I. Introduction

Applying intelligent visual monitoring systems to ATMs has become a safe and effective measure for protecting self-service banking.

With the promulgation and implementation of the “Regulations on Risk Levels and Protection Levels of Bank Business Places” and the “Security Regulations for Bank Self-Service Equipment and Self-Service Banks”, the security of bank business halls, self-service equipment, self-service banks and other places has been further strengthened in the form of regulations. In terms of security technology, self-service banks have adopted intelligent detection of crowd gathering, face covering, and loitering for automatic early warning and prevention, to protect the lives and property of users.

II. Principle and classification of personnel detection technology

At present, personnel detection technology is gradually being applied in automated banking, mainly in the larger scenes of self-service banks and ATMs. There are two main approaches. One is the detection method based on group characteristics, which treats the crowd as a whole and estimates the number of people from the relationship between crowd features and head count. The other is the detection method based on individual characteristics, which takes the human body model as the research object, detects each person separately, and finally counts the number of people. In the self-service banking scenario, however, environmental changes such as lighting changes and camera movement often affect the algorithm, so the algorithm needs to be robust to them.

III. Crowd aggregation detection algorithm based on group mode

The group-based model takes the entire crowd as the analysis target, learns the quantitative characteristics of a single person through training, and then estimates the number of individuals in the crowd. There are two main estimation methods for this type of algorithm. The first obtains the foreground crowd target from a Gaussian background model and then, through training, establishes the correspondence between the foreground target area (the number of image pixels) and the number of people in the crowd. In a self-service bank or ATM scene, perspective means that the distance between the crowd and the lens seriously affects the accuracy of the estimate, so this method is not well suited to the banking scene. It is better suited to larger scenes and to places where judging crowd congestion is relatively simple.
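
As a minimal sketch of the pixel-area approach described above, the correspondence between foreground pixel count and head count can be fitted with an ordinary least-squares line. The linear form, the function names, and the training samples are assumptions made for illustration; the article does not specify the regression model.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def estimate_people(foreground_pixels, a, b):
    """Map a foreground pixel count to a non-negative head count."""
    return max(0, round(a * foreground_pixels + b))


# Hypothetical training samples: foreground pixel count and labeled head count.
train_pixels = [1200, 2500, 3600, 4900, 6100]
train_counts = [1, 2, 3, 4, 5]
a, b = fit_line(train_pixels, train_counts)
print(estimate_people(3600, a, b))
```

Because the fit uses raw image pixels, a person near the lens contributes far more area than a person far away, which is the perspective problem noted in the text.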

The second method takes perspective into account and estimates the number of people from the real area (rather than the number of image pixels) that the crowd and a single person occupy in actual space. Two planes parallel to the ground are taken, one at the top of the head and one at the soles of the feet, and the human body is projected onto each plane to give two projection areas. The overlap of the two projections is the area that the human body occupies in the real scene. This method effectively reduces the influence of perspective on the algorithm and can accurately obtain the area occupied by the crowd in the real scene. For the area occupied by a single person, the mean and standard deviation are obtained by training on multiple video sequences of a single person at different positions in the target scene.
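
The counting step of this second method can be sketched as follows, assuming the crowd's real-world ground area has already been measured. Dividing by the mean single-person area, and using one standard deviation to bracket the estimate, is an illustrative assumption; the function names and sample values are hypothetical.

```python
import statistics


def person_area_stats(samples_m2):
    """Mean and sample standard deviation of single-person ground areas."""
    return statistics.mean(samples_m2), statistics.stdev(samples_m2)


def estimate_count(crowd_area_m2, mean_area_m2, std_area_m2):
    """Return (low, best, high) head-count estimates for a crowd area."""
    best = crowd_area_m2 / mean_area_m2
    # A larger per-person area implies fewer people, and vice versa.
    low = crowd_area_m2 / (mean_area_m2 + std_area_m2)
    high = crowd_area_m2 / max(mean_area_m2 - std_area_m2, 1e-9)
    return round(low), round(best), round(high)


# Hypothetical single-person areas measured at different positions (square meters).
mean_area, std_area = person_area_stats([0.22, 0.25, 0.28, 0.24, 0.26])
print(estimate_count(1.2, mean_area, std_area))
```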

Both methods work only if the following conditions are met: (1) the camera position and angle are standardized; (2) the light in the target scene does not change rapidly; (3) the distances between people in the crowd do not differ too much. First, because environments differ, it is difficult to standardize the position and angle of the camera installation, which limits how widely the algorithm can be applied. Second, the external environment makes it difficult to keep the ambient light in the bank stable. Finally, accuracy depends on people being evenly spaced; if the crowd is unevenly distributed, large errors may result. Therefore, although the accuracy of this type of algorithm has improved, it still cannot meet the needs of self-service banking or ATM environments.

IV. Crowd aggregation detection algorithm based on individual mode

1. Color-based detection algorithm

The color-based detection method generally locates faces by detecting skin color and then counts the number of people in a preset target area. First, a skin color model is established. The main options are a Gaussian skin color model based on the YCgCb or YCgCr color space, an elliptical skin color model with adaptive brightness segmentation, and a mixed skin color model based on HSV and RGB. Second, the color image is segmented into skin-color and non-skin-color regions using the color information. Finally, mathematical morphology or low-pass filtering is used to denoise the skin-color regions so that the face areas can be located and marked, and the number of people is counted.

Although this method is not complicated to implement, it is sensitive to factors such as video capture quality and ambient lighting, and its robustness is poor. In addition, when people face away from the camera, almost no skin-color area can be captured and the method fails. People counting based on skin color therefore has high environmental requirements, a narrow scope of application and low accuracy.
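
The three steps above can be sketched in a few lines. This is only an illustration of a mixed HSV and RGB rule: the thresholds in `is_skin` are assumed values rather than those of any published model, and a minimum region size stands in for the morphological denoising step.

```python
import colorsys


def is_skin(r, g, b):
    """Rough skin test on 8-bit RGB; the thresholds are illustrative only."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return h <= 50 / 360 and 0.15 <= s <= 0.70 and v >= 0.35 and r > g > b


def count_skin_regions(image, min_pixels=2):
    """Count 4-connected skin-colored regions of at least min_pixels pixels."""
    rows, cols = len(image), len(image[0])
    mask = [[is_skin(*image[y][x]) for x in range(cols)] for y in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    for y in range(rows):
        for x in range(cols):
            if not mask[y][x] or seen[y][x]:
                continue
            # Flood fill one connected region and measure its size.
            stack, size = [(y, x)], 0
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                size += 1
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # Small regions are treated as noise and dropped.
            if size >= min_pixels:
                regions += 1
    return regions


S, B = (224, 172, 105), (40, 60, 200)  # hypothetical skin and background colors
image = [
    [S, S, B, S, B],
    [S, S, B, B, B],
    [B, B, B, B, S],
]
print(count_skin_regions(image))
```

A real system would still need to decide which skin regions are faces rather than hands or skin-toned objects.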

2. Detection algorithm based on moving target

The moving-target detection algorithm relies on the fact that people move within the scene. First, the moving foreground area is extracted by analyzing the image sequence, and then the personnel targets are obtained by processing the foreground area. The moving foreground is obtained mainly by background subtraction, which involves three parts: background modeling, foreground detection and background update.

Background modeling accurately identifies the part of a series of video images that belongs to the background and stores it as a background image. Single Gaussian background modeling is currently one of the more commonly used methods. Foreground detection compares the current video image with the background model and subtracts the background to find the foreground target. Background update refreshes the background model in real time when changes in the scene make the original model no longer suitable for the current video image.

However, this method has its own weaknesses. When moving objects are detected, illumination easily produces shadows around them. The shadows move along with the objects, so they are extracted as foreground together with the objects. Because the background is updated from the current frame, the moving target is partially absorbed into the background, and the resulting error between the updated background and the actual background produces a “smearing phenomenon”. These problems affect personnel detection and people counting, and make the method somewhat weaker at handling situations such as overlapping people, although many algorithms have improved on it to varying degrees. The method is mostly combined with other algorithms to detect people. In addition, a person who stays still in the target area for a long time is absorbed into the background and treated as background, so subsequent algorithms cannot extract that person.
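
A minimal sketch of single Gaussian background subtraction, covering modeling, foreground detection and update, is shown below on a flattened grayscale frame. The class name, the learning rate `alpha`, the threshold factor `k` and the initial variance are assumptions chosen for illustration. Every pixel is updated on every frame, which is also why a target that stops moving is gradually absorbed into the background.

```python
class GaussianBackground:
    """Per-pixel single Gaussian background model for grayscale frames."""

    def __init__(self, first_frame, alpha=0.05, k=2.5, init_var=100.0, min_var=4.0):
        self.mean = [float(p) for p in first_frame]  # background modeling
        self.var = [init_var] * len(first_frame)
        self.alpha = alpha      # learning rate of the background update
        self.k = k              # foreground threshold in standard deviations
        self.min_var = min_var  # floor that keeps the variance from collapsing

    def apply(self, frame):
        """Return a foreground mask for the frame, then update the model."""
        mask = []
        for i, p in enumerate(frame):
            diff = p - self.mean[i]
            # Foreground detection: pixel lies more than k sigma from the mean.
            mask.append(diff * diff > self.k ** 2 * self.var[i])
            # Background update: blend the current frame into mean and variance.
            self.mean[i] += self.alpha * diff
            blended = (1 - self.alpha) * self.var[i] + self.alpha * diff * diff
            self.var[i] = max(blended, self.min_var)
        return mask


model = GaussianBackground([100, 100, 100, 100])
print(model.apply([101, 99, 180, 100]))
```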

3. Detection algorithm based on head and shoulders

The head-and-shoulders-based detection algorithm takes human body characteristics as the research object and determines the number of people by extracting body features from the image. In self-service bank or ATM video, people are mostly standing or walking, so the outer contour of the head and shoulders is relatively stable and can be extracted as a human body feature.

The HOG (Histograms of Oriented Gradients) algorithm is used to extract features from the head and shoulders. It computes histograms of the edge (gradient) directions in the image to obtain a feature vector, and then uses a support vector machine to classify the feature vector into head-and-shoulders and non-head-and-shoulders regions, which achieves head-and-shoulders detection. This method avoids the limitations of using color as a feature and can also detect stationary targets, so it is more robust and accurate than moving-target detection. In addition, the algorithm is not affected by crowd density when people gather together. As long as the head and shoulders of each person can be observed, the algorithm can be applied, which is also consistent with how the human eye perceives a crowd.

(1) Region block normalization

To improve accuracy, the contrast of the local histograms can be normalized within each area block of the image. This method first calculates the density of each histogram in the area block and then normalizes each cell in the block according to that density value. After this normalization, the features are more stable under lighting changes and shadows.
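
One common way to carry out this kind of block normalization is to divide every histogram value in the block by the block's L2 norm. The sketch below assumes that choice; the article speaks only of a histogram "density", so the L2 norm and the small constant `eps` are assumptions.

```python
import math


def l2_normalize_block(cell_histograms, eps=1e-6):
    """Normalize all cell histograms of one block by the block's L2 norm."""
    energy = sum(v * v for hist in cell_histograms for v in hist)
    norm = math.sqrt(energy + eps * eps)  # eps avoids division by zero
    return [[v / norm for v in hist] for hist in cell_histograms]


block = [[3.0, 0.0], [0.0, 4.0]]     # two hypothetical 2-bin cell histograms
brighter = [[6.0, 0.0], [0.0, 8.0]]  # same block under doubled contrast
print(l2_normalize_block(block))
print(l2_normalize_block(brighter))
```

Both calls print essentially the same values, which is the sense in which normalization makes the descriptor less sensitive to lighting changes.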

(2) Synthetic feature vector

Because the selected area block is the processing object, the feature vector is formed per area block. The histograms of the three levels of small cells in the area block are arranged in order into a feature vector, which constitutes the histogram descriptor. The area blocks overlap one another, so the output of each cell contributes to the final descriptor several times and appears in the final feature vector with different values, which greatly improves the classification results.
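
The effect of overlapping blocks can be sketched as follows. The sketch simplifies to a flat grid of cells with 2×2 blocks and a stride of one cell, and normalizes each block by its L2 norm; the block size, the stride and the normalization are assumptions, and the article's three-level cell layout is not reproduced here.

```python
import math


def block_descriptor(cell_grid, block=2, stride=1, eps=1e-6):
    """Concatenate normalized histograms from overlapping blocks of cells."""
    rows, cols = len(cell_grid), len(cell_grid[0])
    feature = []
    for top in range(0, rows - block + 1, stride):
        for left in range(0, cols - block + 1, stride):
            # Gather every histogram value of the cells inside this block.
            values = [v
                      for r in range(top, top + block)
                      for c in range(left, left + block)
                      for v in cell_grid[r][c]]
            norm = math.sqrt(sum(v * v for v in values) + eps * eps)
            feature.extend(v / norm for v in values)
    return feature


# Hypothetical 3x3 grid of cells, each holding a 2-bin histogram.
grid = [[[1.0, 2.0] for _ in range(3)] for _ in range(3)]
feature = block_descriptor(grid)
print(len(feature))
```

With a 3×3 grid the center cell falls inside all four blocks, so it appears four times in the descriptor, each time scaled by a different block norm when the surrounding cells differ.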

(3) Gradient calculation

Because this algorithm computes statistics of edge information, that is, it describes the distribution of the target's gradient image, it first applies the one-dimensional template [-1, 0, 1] and its transpose to the original image to compute the gradients and obtain the image edge information.
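
Applying the template [-1, 0, 1] horizontally and its transpose vertically amounts to a centered difference at each pixel. The sketch below works on a grayscale image stored as a list of rows; replicating the border pixels is an assumption, since the article does not say how edges of the image are handled.

```python
def gradients(image):
    """Horizontal and vertical gradients using the [-1, 0, 1] template."""
    rows, cols = len(image), len(image[0])
    gx = [[0.0] * cols for _ in range(rows)]
    gy = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # Neighbors outside the image are replaced by the border pixel.
            left = image[y][max(x - 1, 0)]
            right = image[y][min(x + 1, cols - 1)]
            up = image[max(y - 1, 0)][x]
            down = image[min(y + 1, rows - 1)][x]
            gx[y][x] = float(right - left)  # template [-1, 0, 1]
            gy[y][x] = float(down - up)     # transposed template
    return gx, gy


# Hypothetical image with a vertical edge between columns 1 and 2.
image = [[0, 0, 10, 10] for _ in range(3)]
gx, gy = gradients(image)
print(gx[1], gy[1])
```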

(4) Build a direction histogram

An area block is cut from the gradient image as the processing object; this block can be moved across the entire image to search for head-and-shoulders regions. Each area block is divided into small cells of three different sizes. The area block is divided into 2×2 first-level rectangular cells, each first-level cell is divided into 2×2 second-level rectangular cells, and so on, giving three levels in total.
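
A sketch of the three-level subdivision and of the direction histogram for one cell follows. The nine unsigned orientation bins over 0 to 180 degrees, the magnitude weighting, and the hard assignment to a single bin are assumptions taken from common HOG practice, not details given in the article; the function names are hypothetical.

```python
import math


def pyramid_cells(top, left, size, levels=3):
    """List (top, left, size) of the cells at each level of a square block."""
    cells = []
    for level in range(1, levels + 1):
        n = 2 ** level  # level 1: 2x2 cells, level 2: 4x4, level 3: 8x8
        step = size // n
        for r in range(n):
            for c in range(n):
                cells.append((top + r * step, left + c * step, step))
    return cells


def cell_histogram(gx, gy, top, left, size, bins=9):
    """Magnitude-weighted histogram of gradient directions in one cell."""
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(top, top + size):
        for x in range(left, left + size):
            magnitude = math.hypot(gx[y][x], gy[y][x])
            angle = math.degrees(math.atan2(gy[y][x], gx[y][x])) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# Hypothetical 16x16 block whose gradients all point horizontally.
gx = [[10.0] * 16 for _ in range(16)]
gy = [[0.0] * 16 for _ in range(16)]
cells = pyramid_cells(0, 0, 16)
print(len(cells), cell_histogram(gx, gy, *cells[0]))
```

For a 16×16 block this yields 4 + 16 + 64 = 84 cells, each of which contributes one histogram.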

(5) Support Vector Machine (SVM) Classifier

First, the HOG feature vectors of the samples are input into the SVM. The SVM is trained with positive and negative training sets to find an optimal hyperplane as the decision function, which yields the SVM classifier. Finally, the trained SVM is used to classify the regions of the image being analyzed into head-and-shoulders and non-head-and-shoulders regions.
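
Once training has produced a hyperplane, classification reduces to evaluating the decision function on each window's feature vector. The sketch below assumes a linear SVM and uses made-up weights, a made-up bias and toy three-dimensional features; training itself is not shown, and a real HOG descriptor would be far longer.

```python
def svm_decision(weights, bias, feature):
    """Signed distance-like score of a feature vector from the hyperplane."""
    return sum(w * x for w, x in zip(weights, feature)) + bias


def classify_windows(windows, weights, bias, threshold=0.0):
    """Return the names of windows scored as head-and-shoulders regions."""
    return [name for name, feature in windows
            if svm_decision(weights, bias, feature) > threshold]


# Hypothetical trained parameters and window features.
weights, bias = [1.5, -2.0, 0.5], -0.25
windows = [
    ("w0", [1.0, 0.0, 1.0]),
    ("w1", [0.0, 1.0, 0.0]),
    ("w2", [0.2, 0.1, 0.1]),
]
print(classify_windows(windows, weights, bias))
```

The number of people is then the number of accepted windows after overlapping detections of the same person have been merged.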

V. Conclusion

To sum up, the individual-based algorithm is more accurate than the group-based algorithm and better meets the people-counting accuracy required in the self-service banking scenario. The HOG algorithm, in particular, does not depend on color.
