2022.01.11 16:42

How do you recode variables in Stata

The variable is expressed as GDP per capita in dollars. With the command summarize (which can be shortened to sum) we can see, among other things, the mean, minimum and maximum value of the variable.
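As a minimal sketch of the summarize step, assuming a GDP per capita variable with the hypothetical name gdppc (the actual variable name is not given here, and the toy values below are made up purely for illustration):

```stata
* Toy data for illustration only; gdppc is a hypothetical variable name
clear
input str1 country gdppc
"A" 500
"B" 12000
"C" 45000
"D" .
end

* Number of observations, mean, standard deviation, min and max
summarize gdppc

* The abbreviated form; the detail option adds percentiles such as the median
sum gdppc, detail
```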

The output reports the mean. We then create a new variable, which is the old one divided by 1,000. We can easily do this with generate. After the equal sign, we can enter any mathematical operation. When we compare the old and new variable we see that we have exactly the same number of countries, and the digits are the same, but the decimal point has jumped three steps to the left.

The mean is now one thousandth of what it was. As long as we remember that the new variable shows GDP per capita in thousands of dollars, it has no effect on the analyses, other than making the numbers easier to present in tables.
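The rescaling could look roughly like this. The names gdppc and gdppc_k are hypothetical, and the toy values are made up for illustration:

```stata
* Toy data for illustration only; variable names are hypothetical
clear
input gdppc
500
12000
45000
.
end

* New variable: the old one divided by 1,000
generate gdppc_k = gdppc / 1000

* Compare old and new: same number of observations, values scaled down
summarize gdppc gdppc_k
```

Observations that are missing in the old variable stay missing in the new one, so both variables report the same number of observations.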

We can create a variable with generate, and then change it based on some conditions with replace. Here we have to use if statements. Let's say we want to create a variable that has the value 1 if the country is really poor, meaning that its GDP per capita is below a chosen dollar threshold.

All other countries get the value 0. We start by creating a variable where all countries have the value 0. Then we change the variable so poor countries get the value 1. We do this with the replace command. We write replace, the variable we want to change, the new value, and then any if statements.
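A sketch of these two steps, assuming a hypothetical variable gdppc and an illustrative threshold of 1000 dollars (the threshold used in the original example is not stated here):

```stata
* Toy data for illustration only; gdppc and the threshold are assumptions
clear
input str1 country gdppc
"A" 500
"B" 12000
"C" 45000
"D" .
end

* Step 1: every country starts at 0
generate poor = 0

* Step 2: poor countries get the value 1
replace poor = 1 if gdppc < 1000

tabulate poor
```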

The if statements can use information from other variables. The output shows that 19 changes were made in the variable: 19 countries got the value 1. But here we have also created another problem. If we were to look at the new variable we would notice that it has data for every country in the dataset, although we lack GDP data on two countries; they are "missing", which Stata shows as a dot (.). Observations that have this "missing" value are not included in analyses we do, which is good. But we didn't have any conditions when we created our new poor variable, so these two countries also got the value 0.

Therefore, they remain as zeroes. Also good to know is that Stata for some reason considers this dot to be larger than any number. Not intuitive, but crucial to know: a condition such as "greater than" some number is also true for missing values. Anyway, we don't want these two countries, which we don't know anything about, to be included in our variable at all. We will therefore use replace again, to give them the missing value (.).
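The missing-value fix can be sketched as follows. As before, gdppc, the threshold of 1000 and the toy data are assumptions made for illustration:

```stata
* Toy data for illustration only; country D has no GDP data
clear
input str1 country gdppc
"A" 500
"B" 12000
"C" 45000
"D" .
end

generate poor = 0
replace poor = 1 if gdppc < 1000

* Missing counts as larger than any number, so this condition
* also matches the observation with no GDP data
count if gdppc > 1000000

* Set poor to missing wherever GDP is missing (note the double equals sign)
replace poor = . if gdppc == .

list country gdppc poor
```

Writing the condition as if missing(gdppc) is an equivalent alternative that also covers Stata's extended missing values (.a, .b and so on).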

An interesting thing to note is that I used double equals signs (==) in the if statement. It is also not perfectly intuitive, but it is the standard way to write conditions of this kind; a single equals sign assigns a value. The operators we can choose from are == (equal to), != (not equal to), > (greater than), < (less than), >= (greater than or equal to) and <= (less than or equal to).

With generate and recode, mpg can be recoded into mpgfd based on the domestic car median for the domestic cars, and on the foreign car median for the foreign cars.
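One way to sketch the mpgfd recode is shown here. It assumes the variables come from Stata's bundled auto dataset (the dataset is not named in this text), and it computes the medians with summarize instead of hardcoding them; the local macro names are hypothetical:

```stata
* Assumption: mpg and foreign come from the bundled auto dataset
sysuse auto, clear

* Median mpg for domestic cars (foreign == 0)
summarize mpg if foreign == 0, detail
local med_dom = r(p50)

* Median mpg for foreign cars (foreign == 1)
summarize mpg if foreign == 1, detail
local med_for = r(p50)

* Start from a copy of mpg, then recode each group with its own cutoff:
* 0 = at or below the group median, 1 = above it
generate mpgfd = mpg
recode mpgfd (min/`med_dom' = 0) (nonmissing = 1) if foreign == 0
recode mpgfd (min/`med_for' = 0) (nonmissing = 1) if foreign == 1

tabulate mpgfd foreign
```

Each recode is restricted with if to one group, so the second command does not touch the values the first one already recoded.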

Recode mpg into mpg3, having three categories, using generate and replace if. Recode mpg into mpg3a, having three categories (1, 2, 3), using generate and recode. Recode mpg into mpgfd, having two categories, but using different cutoffs for foreign and domestic cars.
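The first two recodes might be written along these lines. The bundled auto dataset and the cutoffs 18 and 23 are assumptions chosen for illustration, not values taken from this text:

```stata
* Assumption: mpg comes from the bundled auto dataset
sysuse auto, clear

* Version 1: generate and replace if (cutoffs are illustrative)
generate mpg3 = .
replace mpg3 = 1 if mpg <= 18
replace mpg3 = 2 if mpg >= 19 & mpg <= 23
replace mpg3 = 3 if mpg >= 24 & !missing(mpg)

* Version 2: generate a copy, then recode it with the same cutoffs
generate mpg3a = mpg
recode mpg3a (min/18 = 1) (19/23 = 2) (24/max = 3)

* The two versions should agree for every car
assert mpg3 == mpg3a
tabulate mpg3 mpg3a
```

The !missing(mpg) guard in the last replace matters because missing values count as larger than any number; without it, a car with missing mpg would land in category 3.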

Suppose we wanted to make a variable called length2 which has length squared.
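A short sketch of this, assuming length is the variable of that name in Stata's bundled auto dataset:

```stata
* Assumption: length comes from the bundled auto dataset
sysuse auto, clear

* length2 holds length squared
generate length2 = length^2

summarize length length2
```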

When two ranges share a boundary value, Stata will place all respondents with an income equal to that value in the first category, and all those whose income exceeds it, by however little, in the next. Value labels can be attached in the same recode command. The double quotation marks around the labels are necessary only if a label contains blanks, but I can see no reason why you should not always use them.
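A sketch of a labelled recode with a shared boundary, under stated assumptions: the variable name income, the cutoff of 1000, the labels and the toy values are all hypothetical:

```stata
* Toy data for illustration only; cutoff and labels are assumptions
clear
input income
500
1000
1000.01
2500
.
end

* Both ranges mention 1000; the first matching rule wins, so an income
* of exactly 1000 falls in category 1 and anything above it in category 2
recode income (min/1000 = 1 "low") (1000/max = 2 "high income"), generate(income2)

tabulate income2, missing
```

The label "high income" contains a blank, so its quotation marks are required; those around "low" are optional.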

In this case, all observations that do not meet the condition(s) specified with if will be set to missing in the new variable when the gen() option is used. However, these values will be left unchanged, that is, copied from the original variable, if the option copyrest is added.
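The difference can be seen in a small sketch, again assuming the bundled auto dataset; the cutoff of 20 and the new variable names are hypothetical:

```stata
* Assumption: mpg and foreign come from the bundled auto dataset
sysuse auto, clear

* Without copyrest: domestic cars (outside the if condition) become missing
recode mpg (min/20 = 1) (nonmissing = 2) if foreign == 1, generate(mpg_a)

* With copyrest: domestic cars keep their original mpg values
recode mpg (min/20 = 1) (nonmissing = 2) if foreign == 1, generate(mpg_b) copyrest

* Compare how domestic cars are treated in the two new variables
tabulate mpg_a foreign, missing
summarize mpg mpg_b if foreign == 0
```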
