When I was preparing for autumn recruitment, every time I saw the experts on Niuke holding seven or eight offers from big tech companies, I couldn't help envying them. At that time my machine learning skills were nothing to show off: at most I had gone through a few courses and entered two competitions, without earning a ranking in either. I really started to worry. Many friends around me had already started internships at big companies, while I had almost zero internship experience, had never systematically derived the algorithms, and only occasionally practiced on leetcode, about twenty problems in total. It is no exaggeration to call my state at the time a disaster, and I began to panic. Then I calmed down and started looking for interview experience posts. Niuke is where the experts gather; I read about ten posts here and several more elsewhere, and summed up the four key areas that autumn recruitment interviews for algorithm posts examine: self-introduction, project introduction, algorithm details, and data structures and algorithms. Let me go through these four areas one by one.

- Self-introduction

A good self-introduction leaves a deep impression on the interviewer. This part is worded entirely by you, so what gets highlighted is under your own control. From your narration, the interviewer can hear how familiar you are with your projects and how deeply you have thought about them, so preparing in advance is particularly important. During the interview, describe each project following a fixed logic. For an algorithm project, the most important things are the data, the features, the model, and the results. If you explain things clearly along this framework, the interviewer can follow with ease and the later stages of the interview will go more smoothly, because the interviewer will pick up keywords from your self-introduction and ask a series of questions about them in the Q&A session that follows. This implies a tip: whatever you mention, make sure you understand it better than the interviewer does. Don't bring up things you only vaguely understand, or sooner or later it will show.

- Project introduction

This is the top priority: projects reflect a candidate's overall quality. What should you do if you have no internship experience? Easy: enter competitions, try to place well, and get thoroughly familiar with the algorithms you used. The interviewer is bound to dig into the details of your project and ask you to resolve their doubts, for example how positive and negative samples were selected, how features were processed, and the details of the model. If you used a tree model in a competition, you need to know everything related to tree models. A few examples: Why is XGBoost not sensitive to missing values? Compared with ordinary GBDT, how does XGBoost handle missing values? Why can xgboost/gbdt reach high accuracy with a very small tree depth when tuning parameters? If you answer detailed questions like these vaguely, the interviewer is bound to deduct points. So don't take chances and assume the interviewer won't ask. Murphy's law tells us that anything that can go wrong will go wrong. All of this can be prepared; it is almost an open-book exam. Why not think through your answers in advance instead of racking your brains in the interview room? As for competition platforms, you surely know them: Alibaba Tianchi, kaggle, and others are open for everyone to take part in.

- Algorithm details

Besides the details of the algorithms that appear in your projects, interviewers will also ask about your machine learning fundamentals. Here is a summary of the more important topics. Traditional algorithms: logistic regression, naive Bayes, tree models (random forest/Adaboost/xgboost/lightgbm), SVM, PageRank, clustering. Machine learning theory: class imbalance, overfitting, cross-validation, model selection. Recommendation systems: collaborative filtering, FM/FFM, LS-PLM, Wide&Deep, DeepFM, DIN, DIEN, ESMM, Embedding, recall, EE, performance evaluation. These are the core of an algorithm post. In addition, some interviewers place great weight on programming languages and tools such as C++/python/Spark. To prepare for all this, I turned every question I could think of into question-and-answer form and prepared by quizzing myself. All the questions are listed at the end of this article. You can prepare with the whole list or pick parts of it. Because my position is in recommendation algorithms, the list leans in that direction; later, if I have time, I may extend it to other fields such as natural language processing and computer vision. Recommended books, which also come from the experience of Niuke experts: 《Statistical Learning Methods》, 《Baimian Machine Learning》, 《Deep Learning》, 《Deep Learning Recommendation System》, and Zhou Zhihua's 《Machine Learning》. Of course, just reading these books is not enough; having read something does not mean you have mastered it. Go through the question list I made, answer each question in your head or write the answers down. This works extremely well. During my own autumn recruitment I could answer almost instantly, with everything under control. I will also publish these questions on my official account; the write-up is mostly finished. You can also visit my website. The website and the official account are introduced below, and you are welcome to get in touch.

- Data structures and algorithms

This is also quite critical. Some companies will even decide your fate based on your performance in this area. Toutiao, for example, is the famous "master of dynamic programming" and always likes to give medium or hard problems, which is a real headache. Many students from non-computer-science backgrounds have no foundation, and without enough practice you may not be able to think clearly on the spot. My advice is to practice by topic first, such as dynamic programming, sliding window, two pointers, fast and slow pointers, topK, and so on. After about 200 problems you can start practicing at random. Be sure to do a lot of problems; this is an interview rule that cannot be overemphasized. The recommended book is 《Sword Pointing to Offer》 (Jianzhi Offer), and for a website you can use the Chinese leetcode site.
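To make two of the topics above concrete, here is a minimal Python sketch of a topK routine and a sliding-window routine. The function names `top_k` and `longest_unique_substring` are illustrative choices, not taken from any particular problem set, and the problems themselves are typical examples of each topic rather than ones the article names.

```python
import heapq


def top_k(nums, k):
    """Return the k largest values in descending order.

    Keeps a min-heap of at most k elements, so the cost is O(n log k)
    instead of the O(n log n) of a full sort.
    """
    if k <= 0:
        return []
    heap = []
    for x in nums:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            # x beats the smallest of the current top k, so swap it in.
            heapq.heapreplace(heap, x)
    return sorted(heap, reverse=True)


def longest_unique_substring(s):
    """Length of the longest substring without repeated characters.

    Sliding window: `start` is the left edge, `i` the right edge.
    When a character repeats inside the window, move `start` just
    past its previous position.
    """
    last_seen = {}
    start = 0
    best = 0
    for i, ch in enumerate(s):
        if ch in last_seen and last_seen[ch] >= start:
            start = last_seen[ch] + 1
        last_seen[ch] = i
        best = max(best, i - start + 1)
    return best
```

Practicing one topic at a time makes patterns like "bounded heap" or "window with a moving left edge" recognizable, which is what helps when a medium or hard problem appears in an interview.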

Okay, enough talk. The best time to plant a tree was ten years ago; the second best time is now. If you feel your algorithm skills are not yet enough to win an offer during autumn recruitment, start now and conquer the topics one by one. It's not hard. I will also keep updating my website and official account with algorithm Q&A, interview experience, internal referrals, and so on. If you are interested, feel free to follow. I intend to run this account for the long term and put my heart into it, and I hope it brings real help to students working on algorithms.

- Website: http://ml-union.cn
- Official account: Opiate algorithm

Below are the questions I mentioned above. I will keep updating this list. From now on I will mainly write on the official account and probably won't come to Niuke often. Feel free to add me on WeChat so we can exchange ideas.

- Personal WeChat: ayao-algo

## Machine learning Q&A collection

### Logistic regression

- Derive the loss function of logistic regression and explain what it means.
- In an advertising LR model, why do feature combination?
- Why does the LR model use the sigmoid function? What is the mathematical principle behind it? Why not use other functions?
- Why can LR be used to predict click-through rate?
- What conditions should the data meet for LR to work best? In other words, how should the data be processed so that LR works better?
- Can logistic regression solve nonlinear classification problems?
- For a data set with m samples and n-dimensional features, what is the dimension of the gradient in the LR algorithm?
- Why is the logistic regression loss function derived from maximum likelihood estimation rather than least squares?
- How are the parameters of logistic regression solved?
- What are the similarities and differences between SVM and LR? Under what circumstances should each be used?
- Why does LR not use MSE?
- Why does logistic regression need features to be discretized first?
- How is parallel LR implemented?
- What are the applications of logistic regression in the financial field?
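As a reference point for the first and seventh questions above, here is a sketch of the standard derivation. It assumes labels $y_i \in \{0, 1\}$, feature vectors $x_i \in \mathbb{R}^n$, a weight vector $w \in \mathbb{R}^n$, and no separate bias term; these symbols are notation chosen for this sketch, not taken from the list.

$$
\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad \hat{y}_i = \sigma(w^\top x_i)
$$

$$
L(w) = -\frac{1}{m}\sum_{i=1}^{m}\Big[\, y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \Big]
$$

$$
\nabla_w L(w) = \frac{1}{m}\sum_{i=1}^{m} (\hat{y}_i - y_i)\, x_i \;\in\; \mathbb{R}^n
$$

The loss is the negative average log-likelihood of the labels under a Bernoulli model. The gradient has one component per feature, so its dimension is n regardless of m (n + 1 if a bias term is added).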

### Naive Bayes

- What is Bayesian decision theory?
- Do you know what naive Bayes is?
- A company has 60 men and 40 women. Among the men, 25 wear leather shoes and 35 wear sneakers. Among the women, 10 wear leather shoes and 30 wear high heels. All you know is that a person is wearing leather shoes, and you need to guess that person's gender: if the inferred probability of being a man is greater than that of being a woman, guess male; otherwise guess female.
- Can you describe the advantages and disadvantages of naive Bayes?
- The "naive" assumption is a weakness of naive Bayes in prediction. Given such an obvious weakness, why can naive Bayes still achieve good results?
- What is Laplace smoothing?
- Does naive Bayes have any hyperparameters that can be tuned?
- How many kinds of naive Bayes models are there?
- Do you know what applications naive Bayes has?
- Is naive Bayes a high-variance or a low-variance model?
- What are the assumptions of naive Bayes? What are its advantages and disadvantages?
- How does naive Bayes estimate its parameters?
- What is the difference between the Bayesian school and the frequentist school?
- What is the difference between logistic regression and naive Bayes?
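The leather-shoes question in the list above can be worked through directly with Bayes' rule. The following is a minimal sketch using only the counts given in that question; the `counts` table layout and the `posterior` function name are illustrative assumptions.

```python
from fractions import Fraction

# Head counts from the question: (gender, shoe type) -> number of people.
counts = {
    ("male", "leather"): 25,
    ("male", "sneakers"): 35,
    ("female", "leather"): 10,
    ("female", "high heels"): 30,
}


def posterior(gender, shoe):
    """P(gender | shoe) = P(gender) * P(shoe | gender) / P(shoe)."""
    total = sum(counts.values())
    gender_total = sum(n for (g, _), n in counts.items() if g == gender)
    shoe_total = sum(n for (_, s), n in counts.items() if s == shoe)

    prior = Fraction(gender_total, total)                               # P(gender)
    likelihood = Fraction(counts.get((gender, shoe), 0), gender_total)  # P(shoe | gender)
    evidence = Fraction(shoe_total, total)                              # P(shoe)
    return prior * likelihood / evidence


p_male = posterior("male", "leather")      # (60/100 * 25/60) / (35/100) = 5/7
p_female = posterior("female", "leather")  # (40/100 * 10/40) / (35/100) = 2/7
guess = "male" if p_male > p_female else "female"
```

Since 5/7 is greater than 2/7, the decision rule in the question picks male. Exact fractions are used here only to keep the arithmetic easy to check by hand.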

### Tree model

- Describe your understanding of entropy, information gain, and information gain ratio.
- What is the splitting criterion of the ID3 algorithm?
- What are the flaws of the ID3 algorithm? How does C4.5 address them? What flaws do ID3 and C4.5 share?
- How does C4.5 handle missing values?
- What is the splitting criterion of C4.5?
- What are the shortcomings of the C4.5 algorithm?
- What is the definition of the Gini index, and what are its advantages?
- How does CART select the splitting feature when feature values are missing?
- Once the splitting feature is selected, how should the CART model handle samples that are missing that feature value?
- What causes decision trees to overfit, and how can it be addressed?
- What are the strategies for decision tree pruning? What are their advantages and disadvantages?
- What pruning method does C4.5 use?
- How does CART handle class imbalance?
- How does CART handle continuous values?
- Describe the differences among ID3, C4.5, and CART.
- Why does the CART algorithm choose the Gini index?
- How does the C4.5 algorithm handle continuous values?
- How do decision trees handle missing values?
- How is the importance of each feature computed in a decision tree?
- If there are many features, are the features left unused by the decision tree necessarily useless?
- Do decision trees need normalization?
- Since neural networks can also solve classification problems, what is the point of algorithms such as SVM and decision trees?
- What is the relationship between decision trees and conditional probability distributions?
- What is CART's pruning strategy?
- What effect do outliers or unevenly distributed data have on a decision tree?
- What are the advantages of decision trees compared with other models?
- What is the difference between decision trees and logistic regression?
- What is the difference between classification trees and regression trees?
- How should the loss function of a decision tree be understood?
- Should the decision trees in sklearn use one-hot encoding?
- Briefly describe the steps of random forest.
- Can random forest overfit?
- Why does random forest not need to be split into a training set and a test set?
- How does random forest handle missing values?
- What is the difference between random forest and GBDT?
- Compare random forest with SVM.
- Describe the advantages and disadvantages of random forest.
- Briefly describe Adaboost's weight update method.
- Derive Adaboost's sample weight update formula.
- During training, every round still misclassifies some samples, so why can Adaboost as a whole converge quickly?
- What are the advantages and disadvantages of Adaboost?
- What are the similarities and differences between AdaBoost and GBDT?
- Briefly introduce the principle of GBDT.
- Why can regression trees serve as the base learner in GBDT's iterations?
- How is GBDT used for classification problems?
- Why does GBDT split the CART regression tree into m binary trees (each with only two leaf nodes) instead of building a single binary tree with m+1 layers (at most 2^m leaf nodes)?
- How is GBDT regularized?
- Why does gbdt use negative gradients instead of residuals?
- What are the advantages of GBDT?
- What is the role of shrinkage in GBDT?
- Why is residual-based GBDT not a good choice?
- In gradient boosting trees, why is the negative gradient of the objective function with respect to the current model said to approximate the residual?
- Why can xgboost/gbdt reach high accuracy with a very small tree depth when tuning parameters?
- Why do GBDT and Random Forest perform so well in real kaggle competitions?
- How is GBDT used in click-through rate prediction?
- How is the gradient computed in GBDT? The gradient of what with respect to what?
- For an m×n data set, if you use GBDT, what is the dimension of the gradient? Is it related to the depth of the tree, or to the number of leaf nodes?
- The differences and similarities between random forest and GBDT
- In machine learning, what are the differences and connections between GBDT and Adaboost?
- Introduce the principle of XGBoost.
- What is the difference between XGBoost and GBDT?
- The differences and similarities between RF and GBDT
- Why does XGBoost use a second-order Taylor expansion?
- How is the parallelization in XGBoost implemented?
- Why is XGBoost fast?
- How are the leaf node weights computed in XGBoost? Why can leaf node scores be used to measure the complexity of a tree?
- What are the conditions for a tree to stop growing in XGBoost?
- Please derive XGBoost.
- What methods does XGBoost use to prevent overfitting?
- How does XGBoost handle imbalanced data?
- Compare LR and GBDT. In what situations is GBDT worse than LR?
- How are trees pruned in XGBoost?
- When training a model with XGBoost, how should the parameters be tuned if it overfits?
- How does XGBoost choose the best split point?
- How is XGBoost's scalability reflected?
- How does XGBoost evaluate feature importance?
- What are the general steps of XGBoost parameter tuning?
- If an XGBoost model overfits, how can that be solved?
- Why is XGBoost not sensitive to missing values? Compared with ordinary GBDT, how does XGBoost handle missing values?
- How is regularization implemented in XGBoost?
- What is the difference between XGBoost and LightGBM?
- How does XGBoost compute the inverse of the Hessian matrix?
- How does xgboost use the approximate algorithm to find split points?
- What are the advantages and disadvantages of LightGBM compared with XGBoost?
- Introduce several common ensemble learning frameworks: boosting/bagging/stacking.
- Why is ensemble learning better than a single learner?
- Briefly describe the meaning of a model's variance and bias.
- Must the base models in ensemble learning be weak models?
- Compute the overall expectation and overall variance of the model.
- Why must the base models in Bagging be strong models?
- Why must the base models in the Boosting framework be weak models?

### Feature Engineering

- In machine learning, what engineering methods are there for feature selection?
- In ad click-through rate models, what are the advantages and disadvantages of LR, GBDT+LR, FM, and DNN? How do they perform in practice?
- For multi-label data learning problems, what are the commonly used classifiers or classification strategies?

### SVM

- In SVM, how do you understand the constant C?
- In SVM, why can the functional margin be set to 1?
- Machine learning talks a lot about kernel functions. What is the definition and role of a kernel function?

### Optimization algorithms

- What is the gradient descent method?
- What problems arise when training SVM with gradient descent?
- What is the difference among least squares, maximum likelihood, and gradient descent?
- In optimization problems, why does Newton's method need fewer iterations than gradient descent?
- A major problem with neural networks is that they fall into local optima. Why not choose a convex function as the activation function?

### Loss function

- Explain the definition of a loss function.
- Describe your understanding of the logistic regression loss function.
- Describe your understanding of the squared loss function.
- Describe your understanding of the exponential loss function.
- Describe your understanding of the hinge loss function.
- Compare the loss functions of logistic regression and SVM.
- For logistic regression, why is the squared loss function non-convex?
- How is the derivation of SVM related to its loss function?
- How do you design a custom loss function for a neural network? If you need to modify or design your own loss, what rules should you follow?
- What is the relationship between softmax and cross-entropy?
- Why is the loss function of a neural network non-convex?
- What loss functions (optimization objectives) are commonly used in deep learning?
- In neural networks, what techniques are there for designing a loss function?
- In neural networks, why not set the partial derivatives of the loss function to zero and solve for the optimal weights, instead of computing the weights iteratively with gradient descent?
- When using the cross-entropy loss function, I only want to penalize ambiguous values such as 0.4~0.6. How should the loss be changed?

### Regularization

- Explain the meaning of regularization.
- What is the relationship between regularization and the prior distribution of the data?
- Why does L1 yield sparse solutions more easily than L2?
- Why can L1 regularization drive coefficients to 0? How does L1 handle the non-differentiable point at 0?
- How does deep learning prevent overfitting?
- When the objective function uses several L1 and L2 regularization terms at the same time, how is it solved?

### AUC

- Explain AUC.
- Are AUC and accuracy necessarily positively correlated? Is there any intrinsic relationship between them?
- What are the advantages and disadvantages of accuracy, recall, F1 score, ROC, and AUC?
- Why can accuracy, precision, f1-score, and recall all be high while AUC is low?
- In machine learning, how are F1 and ROC/AUC evaluated for multi-class classification?
- How can the inconsistency between offline AUC and online click-through rate be resolved?
- Why is AUC insensitive to the ratio of positive to negative samples?
- How high does AUC need to be?
- Give a probabilistic interpretation of AUC.
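For the last question above, one common probabilistic reading is that AUC equals the probability that a randomly chosen positive sample is scored higher than a randomly chosen negative one, with ties counted as one half. Below is a minimal sketch of that definition; the function name `auc_pairwise` and the toy labels and scores are made up for illustration, and the O(P × N) double loop is chosen for clarity, not speed.

```python
def auc_pairwise(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 ground-truth labels.
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # pair ranked correctly
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only).
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
auc = auc_pairwise(labels, scores)

# Duplicating every negative sample changes the class ratio but not the AUC.
auc_more_negatives = auc_pairwise(labels + [0, 0], scores + [0.8, 0.1])
```

Because the value is a ratio over positive-negative pairs, repeating all negatives (or all positives) scales the numerator and denominator by the same factor, which is one way to see why AUC is insensitive to the class ratio.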

### Imbalanced data

- What methods are there for handling imbalanced data sets in machine learning?
- Briefly introduce how the SMOTE sampling method handles imbalanced data.
- What are the problems with the original SMOTE algorithm? How can it be improved?
- Briefly introduce the Tomek Links undersampling method.
- Briefly introduce the NearMiss method.
- How does the EasyEnsemble algorithm solve the problem of imbalanced data?
- How does the BalanceCascade algorithm solve the problem of imbalanced data?
- Can SMOTE oversampling and Tomek Links undersampling be combined?

## Recommendation system Q&A collection

### Shallow model

- Briefly describe the recommendation process of user-based collaborative filtering (UserCF).
- In user-based collaborative filtering, how is the similarity between users computed?
- What are the shortcomings of user-based collaborative filtering (UserCF)?
- Briefly describe the recommendation process of item-based collaborative filtering (ItemCF).
- Briefly describe the offline engineering implementation of the item-based collaborative filtering algorithm (spark).
- What are the drawbacks of spark's existing solution (multiply) for the large-scale sparse matrix multiplication in collaborative filtering? How can it be improved?
- Briefly describe the advantages and disadvantages of collaborative filtering algorithms.
- In real business, what should you pay attention to when using collaborative filtering so that it brings more value to the business?
- Collaborative filtering suffers from the cold start problem. Where does this mainly show up?
- What recommendation business scenarios can collaborative filtering be used for?
- Briefly describe the engineering implementation of near-real-time collaborative filtering.
- What is the principle of matrix factorization? What are the main methods for solving it?
- How can the matrix factorization model be understood from the perspective of deep learning models?
- In matrix factorization, how does the latent vector length k affect the results and the engineering cost?
- Briefly describe the process of singular value decomposition. What are its defects? Why is it not suitable for Internet scenarios?
- Briefly describe how gradient descent solves for the user and item latent vectors.
- How can the rating bias across users and items be addressed in matrix factorization?
- Compared with collaborative filtering, what is the biggest advantage of predicting with a logistic regression model? What is its recommendation process?
- Derive the mathematical form of logistic regression.
- Derive the parameter update of logistic regression under gradient descent.
- What are the advantages and disadvantages of logistic regression as a CTR prediction model?
- In industry, continuous values are rarely used directly as feature inputs to an LR model; instead, continuous features are discretized into a series of 0/1 features. What are the advantages?
- In CTR prediction, what are the shortcomings of crossing all features pairwise and assigning a weight to every combination?
- What is the principle of FM? What is its connection with matrix factorization?
- Why does FM generalize better than POLY2? What are its engineering advantages?
- What does FFM improve compared with FM?
- What is the training complexity of FM? How is it derived? What is the training complexity of FFM?
- Why can GBDT be used for feature selection and feature combination?
- In the GBDT+LR combined model, how does GBDT generate feature vectors?
- What are the advantages and disadvantages of GBDT+LR?
- Briefly describe the principle and mathematical form of the LS-PLM model proposed by Alimama.
- What are the advantages of the LS-PLM model?
- What is the relationship between the LS-PLM model and deep learning models?

### Deep models

- Briefly describe the network structure of Deep Crossing.
- What is the role of the residual unit in Deep Crossing?
- What improvements does NeuralCF make on the matrix factorization model?
- What improvements does PNN make over NeuralCF and Deep Crossing? What are its advantages?
- What feature crossing methods are there in PNN? How do they differ?
- What are the advantages and limitations of the PNN model?
- How should Memorization in the Wide&Deep model be understood?
- What are the disadvantages of Memorization in Wide&Deep?
- How should Generalization in the Wide&Deep model be understood?
- What are the disadvantages of Generalization in Wide&Deep?
- Briefly describe the model structure of Wide&Deep.
- Wide&Deep uses Joint Training. What are the benefits?
- How do the authors of wide&deep apply wide&deep to make recommendations?
- Why does the wide&deep model use two optimization methods, ftrl and adagrad?
- What are the innovations and advantages of the Wide&Deep model?
- In application scenarios, which features suit the Wide side and which suit the Deep side? Why?
- Why should continuous features be discretized in the wide&deep model?
- What improvements does Deep&Cross make over Wide&Deep? How does the Cross network in the Deep&Cross model operate?
- What problem is the FNN model mainly meant to solve? How does it solve it?
- What is the model structure of FNN?
- How does the FNN model use FM to initialize the parameters of the Embedding layer?
- What is the motivation behind DeepFM?
- In DeepFM, the FM layer and the NN layer share feature Embeddings. What are the benefits?
- What does DeepFM improve compared with Wide&Deep? Why is it improved this way?
- What does NFM improve compared with Wide&Deep? Why is it changed this way?
- What are the characteristics of users' interest in items? How does DIN capture these characteristics?
- What methods are generally used to capture user interest?
- How does DIN handle its input?
- How is DIN's activation unit designed?
- Why is the cross product added as an input?
- Why is the AU implemented with a simple MLP?
- DIN replaces the classic PReLU activation function with Dice. What are the advantages?
- DIN uses an adaptive regularization algorithm. What is its motivation?
- The DIN paper uses GAUC as the evaluation metric. What are its benefits?
- What is DIEN's motivation for introducing sequence information?
- Draw the structure of each layer of DIEN: the interest extraction layer and the interest evolution layer.
- What does CVR prediction estimate?
- Unlike CTR prediction, what are the data sparsity and sample selection bias problems faced in cvr prediction?
- How does ESMM solve the problem of sample selection bias?
- How does ESMM solve the problem of data sparsity?
- Why is ESMM's structure designed around a multiplicative relationship rather than division?
- Explain ESMM's objective function.

### Embedding

- Why is Embedding technology important for deep learning recommendation systems?
- Briefly describe the principle and structure of Word2Vec.
- What methods are used to speed up Word2Vec training?
- Describe your understanding of Item2Vec. What are its limitations?
- Briefly describe the structure of the two-tower model. What is the role of the item tower?
- Is a more complex model structure always better? Are more features always better?
- What is the main idea of DeepWalk? Describe the steps of the algorithm.
- What do homophily and structural equivalence mean in Node2Vec? How do they correspond to DFS and BFS?
- Write out the transition probability formula between nodes in Node2Vec.
- Give examples that explain intuitively what homophily and structural equivalence in Node2Vec mean in a recommendation system.
- EGES was proposed mainly to remedy the defects of DeepWalk. How does it do so?
- Briefly describe the structure of the EGES model and what each layer does.
- What are the applications of Embedding in deep learning recommendation systems? List three directions.
- What problems arise when Embedding is trained as part of a deep learning model?
- What pre-training methods are there for Embedding? Describe each.
- Briefly describe the process of using Embedding as the recall layer.
- Briefly describe the principle of locality-sensitive hashing and its role in recommendation systems.

### Recall

- Suppose the item inventory reaches the millions. How would you design a method to recommend the top10 items to a user from this volume while also reducing the computational load?
- Why does ranking receive more attention than recall?
- What characteristics make recall models significantly different from ranking models?
- Why can't "exposed but not clicked" samples be used directly as negative samples for the recall model?
- How does the recall model randomly sample negative samples?
- What are the drawbacks of using random samples as negative samples? How can they be solved?
- Why is there a recall stage in a recommendation system? What are the similarities and differences between recall and ranking?
- How is item suppression implemented in the recall stage of a recommendation system?
- What is the gap between CTR prediction and the goals of a recommendation system?
- Do real recommendation systems rank only by predicted CTR?
- Why is CTR prediction only suitable for scenarios that have "true negative" samples?
- Which scenarios in a recommendation system cannot obtain true negative samples? How can this be solved?
- Briefly describe embedding-based recall methods. What are their advantages?
- How does Airbnb's listing embedding recall algorithm select positive and negative samples?
- How does Airbnb's user/listing-type embedding recall algorithm select positive and negative samples?
- How does Facebook's EBR algorithm select positive and negative samples?
- Why does recall require isolating and decoupling user and item features? How is the decoupling done?
- In recall scenarios, why is Pairwise LearningToRank often used to model relative ranking accuracy?
- What forms of loss function can be used to optimize Pairwise LearningToRank for recall?
- How does Pinterest's PinSAGE construct positive samples?
- Briefly describe the principle of the DSSM model.
- The DSSM input layer maps text into a low-dimensional vector space and converts it into vectors. What problems does this have?
- Briefly describe how DSSM is applied to recall. What is its structure?
- Briefly introduce the advantages and disadvantages of DSSM.
- Why are DSSM's negative samples randomly sampled, rather than using "exposed but not clicked" samples as negatives?
- Briefly describe Baidu's twin-tower model.
- Briefly describe the structure and principle of YoutubeDNN.
- What is the motivation of the user multi-interest network MIND? Describe its structure.
- How does SDM combine users' long-term and short-term interests? What is its structure?

### Feature Engineering

- If you were asked to design the feature engineering for a recommendation system, how would you design it? Include user-side, item-side, and contextual features.
- During feature processing, how should continuous features be handled?
- During feature processing, how should categorical features be handled?

### Exploration and exploitation

- When a new user registers or a new item is added to the catalog, how do you give the user satisfactory recommendations, and how do you get the new item recommended to the users who would like it?
- Briefly describe what exploration and exploitation mean.
- What is the principle of the greedy algorithm? What are its drawbacks?
- Briefly describe the principle and steps of Thompson sampling.
- How does UCB address the exploration-exploitation problem in cold start?
- Briefly describe the principle and concrete implementation of LinUCB.
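
To make the UCB question concrete, here is a minimal sketch of the classic UCB1 rule on a toy simulation. It is only an illustration under my own assumptions: the function names `ucb1_select` and `simulate` and the reward probabilities are hypothetical, and a real recommender would use logged feedback rather than simulated Bernoulli rewards.

```python
import math
import random


def ucb1_select(counts, rewards, t):
    """Pick the arm with the highest UCB1 score at round t (t >= 1)."""
    # Play every arm once before trusting the confidence bound.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Score = average reward (exploitation) + confidence bonus (exploration).
    scores = [
        rewards[arm] / counts[arm] + math.sqrt(2 * math.log(t) / counts[arm])
        for arm in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def simulate(reward_probs, rounds=5000, seed=0):
    """Run UCB1 against Bernoulli arms with hypothetical reward probabilities."""
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)
    rewards = [0.0] * len(reward_probs)
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, rewards, t)
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        counts[arm] += 1
        rewards[arm] += reward
    return counts


# Hypothetical arms: think of three new items with unknown appeal.
counts = simulate([0.1, 0.5, 0.9])
print(counts)  # how many times each arm was chosen
```

The bonus term shrinks as an arm is pulled more often, so a rarely shown item keeps getting a chance until there is enough evidence about it. That is the property that makes this family of methods a common talking point for cold start.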

### Feature evaluation

- How does the timeliness of the model affect the quality of a recommendation system?
- How does the client provide real-time features for real-time recommendation?
- How does a stream computing platform process features in near real time?
- What roles do the distributed storage system HDFS and distributed batch processing platforms play in recommendation?
- Briefly describe the offline/nearline/online training methods and steps.
- Name several metrics used for offline evaluation.
- Explain the meaning of the P-R curve, the ROC curve, and AUC, and how they relate to each other.
- Is there a neat way to draw an ROC curve?
- Does an improvement in offline AUC necessarily lead to an improvement in online metrics? Why?
- Why run A/B tests? What advantages do they have over offline evaluation?
- How are the layering and traffic-splitting mechanisms designed for A/B tests?
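
For the ROC and AUC questions, a small worked example can help when practicing. The sketch below is only an illustration: the function names `auc` and `roc_points` and the toy labels and scores are my own, and `roc_points` assumes all scores are distinct and that both classes are present.

```python
def auc(labels, scores):
    """AUC as the chance that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def roc_points(labels, scores):
    """ROC points from sweeping the threshold from the highest score down."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for y, _ in sorted(zip(labels, scores), key=lambda pair: -pair[1]):
        if y == 1:
            tp += 1  # a positive moves the curve up
        else:
            fp += 1  # a negative moves the curve right
        points.append((fp / n_neg, tp / n_pos))
    return points


labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
print(auc(labels, scores))         # 0.75
print(roc_points(labels, scores))  # (FPR, TPR) pairs
```

Sorting by score and stepping up for each positive and right for each negative is one way to draw the curve by hand, and the area under those steps equals the pairwise ranking probability computed by `auc`.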

# Deep learning

- Write down the commonly used loss functions: squared loss, cross-entropy loss, softmax loss, and hinge loss.
- Why are deep neural networks hard to train? What are the main reasons?
- Illustrate forward propagation and backpropagation with examples.
- What is the purpose of introducing nonlinear activation functions in deep learning?
- Name the commonly used activation functions and sketch their graphs.
- How do you choose an activation function? Describe the characteristics of each.
- What are the advantages of the ReLU activation function?
- Explain the definition and purpose of the softmax activation function. How is softmax applied to multi-class classification?
- Why is a batch size needed when training deep models? How do you choose an appropriate batch size, and how does it affect the results?
- Explain the principle of BN. Why use batch normalization?
- What is fine-tuning? Explain the three states of a model under fine-tuning and the characteristics of each.
- Why does unsupervised pre-training help deep learning?
- What methods are there for initializing weights and biases? Explain their characteristics.
- What is the role of the learning rate? What are the common learning rate decay methods? Explain the characteristics of each.
- What methods are there for preventing overfitting in deep learning?
- Name several common optimization algorithms and the characteristics of each.
- How do you balance variance and bias in deep learning? What should you do if the bias is too large? What if the variance is too large?
- Explain the principle of dropout. How does dropout differ between training and testing?
- What data augmentation methods are commonly used in deep learning?
- How do you understand internal covariate shift?
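
For the loss-function and softmax questions, it helps to be able to write the formulas down from memory. The following is a minimal pure-Python sketch under my own naming (`softmax`, `cross_entropy`, `squared_loss`, and `hinge_loss` are illustrative helpers, not a library API); real training code would use a framework's vectorized versions.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities; subtracting the max avoids overflow."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Softmax cross-entropy loss for one sample with integer class index `target`."""
    return -math.log(softmax(logits)[target])


def squared_loss(pred, target):
    """Squared error for a single regression prediction."""
    return (pred - target) ** 2


def hinge_loss(score, y):
    """Hinge loss for a label y in {-1, +1} and a raw classifier score."""
    return max(0.0, 1.0 - y * score)


probs = softmax([1.0, 2.0, 3.0])
print(probs)                               # three probabilities that sum to 1
print(cross_entropy([1.0, 2.0, 3.0], 2))   # small loss: class 2 has the top score
print(hinge_loss(0.3, 1), hinge_loss(2.0, 1))
```

For multi-class classification, the predicted class is the index with the largest softmax probability, and the cross-entropy loss penalizes low probability on the true class.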

## C++: 100 questions and answers

### Basics

- What is the purpose of variables? What is the syntax for creating a variable?
- What is the purpose of constants in C++? Write down two ways to define a constant.
- Give a few examples of reserved keywords in C++.
- How much memory do the short, int, long, and long long types each occupy?
- What does the sizeof keyword do?
- How much memory does a character variable occupy? What is distinctive about how it is stored?
- Give some examples of escape characters in C++.
- What is the difference between pre-increment and post-increment in C++?
- Write an example of the ternary operator and explain it.
- What is the role of break in a switch-case statement?
- In a for loop, what is the execution order of the initial expression, the condition expression, the end-of-iteration expression, and the loop body?
- What do the break and continue statements do?

### Array

- What are the characteristics of arrays? How do you define an array?
- What is the relationship between the name of a one-dimensional array and its memory address?
- How do you define a two-dimensional array? What is the relationship between the name of a two-dimensional array and its memory address?

### Functions

- Explain the meaning of formal parameters and actual arguments.
- What does passing by value mean? How does it affect the formal parameters and the arguments?
- What does a function declaration do?

### Pointers

- What does a pointer do? What is the difference between a pointer variable and an ordinary variable?
- How much memory does a pointer occupy?
- What is the difference between a pointer to const and a const pointer?
- What is the difference between passing by value and passing by address?

### Structs

- How do you create a struct? Write down two methods.
- How do you create an array of structs?
- How does a struct pointer access the members of a struct?
- How do you nest one struct inside another? Give an example.
- Can a struct be passed to a function as a parameter?

### Memory

- Briefly describe the function and characteristics of each memory region when a C++ program runs (code area, global area, stack, heap).
- What does the new operator do? How do you use it?

### References

- What is the purpose of a reference? What is it in essence?
- How does using a reference as a function parameter differ from passing by value and passing by address?
- What are const references for, and how are they written?
- What do you need to watch out for when writing default function parameters?

### Overloading

- What conditions must be met for function overloading?

### Encapsulation

- What is the meaning of encapsulation?
- What access levels are there for a class's member attributes and behaviors? How do they differ?
- What is the difference between a class and a struct?
- What are the advantages of making member attributes private?

### Initialization and cleanup

- What do constructors and destructors do?
- What is the syntax of a constructor? What are the characteristics of constructors?
- What is the syntax of a destructor? What are the characteristics of destructors?
- What are the rules for constructor calls?
- Explain deep copy and shallow copy in C++.
- What is the syntax of the member initializer list in C++?
- Class B has an object of class A as a member. When a B object is created, in what order are A and B constructed and destructed?
- What are the characteristics of static members?
- Are member variables and member functions stored separately in a class? Do non-static member variables occupy space in the object?
- What does the this pointer do?
- What effect does const have when it qualifies a member function? What does the mutable keyword do?
- What is the purpose of friends in C++? How are global functions, classes, and member functions declared as friends?
- What kinds of inheritance are there? What access does each give?
- Can a subclass inherit the private members of its parent class?
- In what order are the constructors and destructors of parent and child classes called?
- When a subclass and its parent class have members with the same name, how do you access the subclass's member and the parent's member through a subclass object?
- What problems does diamond inheritance cause? How does C++ solve them?
- What is the difference between static polymorphism and dynamic polymorphism?
- What conditions must be met for polymorphism, and how is it used?
- What are the advantages of polymorphism?
- What is a pure virtual function? What is its syntax? How does it relate to abstract classes?
- Explain the meaning, syntax, and differences of virtual destructors and pure virtual destructors.
- How do you create a function template? What is it for? What should you watch out for?
- What is the difference between an ordinary function and a function template? What are the rules for calling them?
- What problem does function template specialization solve?
- What is the purpose of class templates? What is the syntax? How do they differ from function templates?
- When are the member functions of a class template created?
- Explain containers, algorithms, and iterators in the STL.

## Python: 100 questions and answers

- What are the differences between list, tuple, dict, set, and other types in Python?
- What forms can function parameters take? What are the characteristics of each?
- Explain the default-argument pitfall in Python.
- Give an example that illustrates the difference between shallow copy and deep copy.
- What are generators and iterators?
- Briefly describe how the built-in function zip is used. How does it handle iterables of different lengths, and is there an alternative?
- How are the higher-order functions map/reduce/filter/sorted used? Illustrate with examples.
- What is a closure? Illustrate with examples.
- What are the benefits of anonymous functions? Give an example of their usage.
- What is a decorator? How is it used?
- What is a partial function? How is it used?
- What advantages does enumerate have over range?
- What is a factory function? Illustrate with examples.
- Illustrate the difference between class attributes and instance attributes.
- Explain the concepts of inheritance and polymorphism with examples.
- How do you restrict access to attributes in a class?
- How do you use `__slots__`?
- What are `__str__`, `__iter__`, `__getitem__`, `__getattr__`, and `__call__` each used for in a custom class?
- What is the difference between static methods, class methods, and instance methods?
- What are @classmethod, @staticmethod, and @property?
- What is the difference between `__init__` and `__new__`?
- What is introspection in Python?
- How does Python manage memory?
- What is the GIL?
- Briefly describe Python's exception handling mechanism.
- How do you locate bugs in a Python program? How do you step through code in Python?
- What are assert statements for?
- What built-in attributes does a class have?
- How do you turn a list of strings into a single space-delimited string?
- How does the is operator compare objects in Python?
- Write a regular expression that matches an email address.
- How do you pass command-line arguments in Python?
- How do you understand threads in Python?
- Briefly describe multiprocessing in Python.
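
The default-argument pitfall in the list above is easiest to remember from a small demonstration. This is a minimal sketch with hypothetical function names (`append_bad`, `append_good`); it shows that a mutable default value is created once, when the function is defined, and then shared across calls.

```python
def append_bad(item, bucket=[]):
    """Buggy: the same default list object is reused on every call."""
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    """Usual fix: use None as a sentinel and build a fresh list per call."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_bad(1)
second = append_bad(2)
print(first is second)  # True: both names point at the shared default list
print(second)           # [1, 2]

print(append_good(1))   # [1]
print(append_good(2))   # [2]
```

The same reasoning applies to any mutable default, such as a dict or a set.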

## Spark: 100 questions and answers

- Briefly describe the concept of an RDD. How do you create one?
- What operations does an RDD support?
- What are the characteristics of each kind of RDD operation?
- Give examples of RDD transformations and actions.
- Explain the RDD lazy evaluation mechanism.
- Explain the purpose and usage of the transformations map, flatMap, filter, and distinct.
- Explain the purpose and usage of the transformations union, intersection, subtract, and cartesian.
- Explain the usage of and differences between the actions reduce, fold, and aggregate.
- Why persist an RDD? Briefly describe the different storage levels of the caching mechanism.
- Briefly describe the purpose of and differences between reduceByKey, groupByKey, and combineByKey.
- What do wide and narrow dependencies mean for RDDs, and how do they differ?
- For RDDs, what do Client, Master, Worker, Driver, and Executor each mean?
- What is the main difference between DataFrame and RDD?
- Why is Spark faster than MapReduce?
- What components does Spark have?
- What is the basic principle of Spark Streaming?
- How do you solve data skew in Spark?
- Explain the meaning of and differences between the three SparkSQL joins: (1) Broadcast Join, (2) Shuffle Hash Join, (3) Sort Merge Join.
- Can Spark replace Hadoop?
- What is the role of the Executor?
- What is the role of the Driver?
- How does RDD caching work in Spark?
- What parameters does the spark-submit script take?
- Wide dependencies and narrow dependencies.
- Summarize the RDD operators (more than 30).
- What is the difference between coalesce and repartition?
- What is the difference between reduceByKey and groupByKey?
- What is the difference between union and intersection?
- What are the commonly used wide-dependency and narrow-dependency operators?
- How is a DAG divided into stages?
- How are jobs divided?
- How do you choose a persistence level in Spark?
- What are the application scenarios of persistence and fault tolerance?
- What is an accumulator?
- What are broadcast variables?
- What is the relationship between nodes and task execution?
- How do you view logs in cluster mode?
- How do you optimize Spark?
- What is a DataFrame?
- What are the differences between RDD, DataFrame, and Dataset?
- RDD, the core of Spark: caching mechanism, application scenarios, usage, and how to clear the cache.
- DAGs (directed acyclic graphs) and stage division.

You are welcome to follow my official account.

- Website: http://ml-union.cn
- Official account: Opiate algorithm

You are also welcome to add me on WeChat so we can discuss things together.

- Personal WeChat: ayao-algo