Fortunately, Excel includes a set of tools for working with text strings and automating many of the tasks that involve them. Today we will look at them in more detail.
How to split a string into substrings in Excel
There are several ways to do this. First of all, it can be done with text functions. The most popular of them is MID, but there are many more. With their help, you can implement almost any idea that comes to mind or that management hands you at work.
Macros can also be used for this purpose. VBA has a dedicated function for it: Split. It breaks a string apart at a delimiter, which can be a single character or a sequence of several characters. The function takes four arguments, of which only the first is required.
- expression. The string to be split into substrings.
- delimiter. The separator. This argument is optional. If it is omitted, a space is used as the separator.
- limit. The maximum number of substrings to return. This argument is also optional. The default value is -1, which means that all substrings are returned.
- compare. The type of comparison used to find the delimiter: binary or text. Put simply, with a binary comparison (value 0) the function is case-sensitive, and with a text comparison (value 1) letter case is ignored.
The function returns a zero-based array of substrings, with at most as many elements as the limit argument allows. As the observant reader will have noticed, a limit of -1 makes the function return all substrings. Now let's look at a few examples of how this VBA function works.
Sub Test1()
    Dim a() As String
    ' No delimiter is given, so the string is split on spaces
    a = Split("vremya ne zhdet")
    MsgBox a(0) & vbNewLine & a(1) & vbNewLine & a(2)
End Sub
This macro displays a message box with the three substrings "vremya", "ne" and "zhdet", each on its own line. The default settings are used here. The next macro takes the same phrase written with hyphens instead of spaces and limits the result to two substrings, so the message box shows "vremya" and "ne-zhdet".
Sub Test2()
    Dim a() As String
    ' Split on hyphens and return at most two substrings
    a = Split("vremya-ne-zhdet", "-", 2)
    MsgBox a(0) & vbNewLine & a(1)
End Sub
Here the delimiter argument is "-" and the limit is 2. The string could have been split into three parts, but since we asked for only two, the result is the substring "vremya" and the substring "ne-zhdet": the last element keeps the rest of the string unsplit. As you can see, it is all quite simple.
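The compare argument was not used in the examples above. The following is a minimal sketch of how it changes the result; the macro name Test3 and the sample string are made up for illustration.

```vba
Sub Test3()
    Dim parts() As String
    Dim i As Long

    ' Binary comparison (vbBinaryCompare = 0): only the lowercase "x" is a delimiter.
    parts = Split("aXbxc", "x", -1, vbBinaryCompare)
    Debug.Print UBound(parts) - LBound(parts) + 1   ' 2 substrings: "aXb" and "c"

    ' Text comparison (vbTextCompare = 1): both "X" and "x" are delimiters.
    parts = Split("aXbxc", "x", -1, vbTextCompare)
    For i = LBound(parts) To UBound(parts)
        Debug.Print parts(i)                        ' a, b, c on separate lines
    Next i
End Sub
```

The output appears in the Immediate window of the VBA editor. Looping from LBound to UBound avoids hard-coding the number of substrings, which is useful when the input is not known in advance.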
Text Functions in Excel
All functions designed to work with text are grouped in the Text category of the Function Wizard. There are a lot of them. We will pick out those that are used most often to solve practical problems:
- BAHTTEXT(Number). Converts a number to text. Note that it writes the number out in Thai words with the "Baht" suffix, so for a general conversion of a number to text the TEXT function is the usual choice. Such a conversion is useful when a formula requires a text value while the cell holds a numeric one.
- LEN(Text). Returns the length of a string, that is, the number of characters it contains.
- REPLACE(Old text, Start position, Number of characters, New text). Replaces part of a text with another text: the given number of characters, starting from the position specified by the user, is replaced with the new text.
- VALUE(Text). Performs the opposite operation: it converts a value stored as text into a number.
- LEFT(String, Number of characters). Returns the specified number of characters from the specified string, counted from the left.
- RIGHT(String, Number of characters). Works on the same principle, but returns the specified number of characters from the right, that is, the end of the string up to and including its last character.
- FIND(Text to search for, Text to search in, Starting position). Returns the position at which the text specified by the user is located. This function is case-sensitive. If it makes no difference whether the letters are uppercase or lowercase, use the similar function SEARCH. Note also that this function returns only the first occurrence; all subsequent occurrences are ignored. There are other functions for that.
- SUBSTITUTE(Text, Old text, New text, Occurrence number). A very interesting function. In some ways it is similar to REPLACE, but it looks for the old text itself rather than a position. If the last argument is omitted, all occurrences in the text are replaced, so it lets you automate Excel's "Replace All" command.
- SUB-LINE(Text, Separator, Number). Returns the substring with the given number from a text that is divided by the specified separator.
- MID(Text, Start position, Number of characters). This is one of the most important functions, and we will analyze it in great detail today. It works much like LEFT, except that the substring can start not only at the very beginning but at any position.
- CONCATENATE(Text1, Text2…). Joins several strings into one. It is an alternative to the & operator. Current versions of Excel accept up to 255 text arguments (Excel 2003 and earlier accepted up to 30).
Many of these functions work on similar principles, so once you have learned one of them, the next ones will be much easier. And when you start putting them into practice, you will remember them without effort. Let's describe a real example of how text functions can be used.
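To get a feel for several of the functions listed above without building a worksheet, they can be tried from VBA with Evaluate. This is only a sketch: the macro name and the sample strings are made up, and the formulas passed to Evaluate are written with English function names and comma separators.

```vba
Sub TextFunctionDemo()
    ' Each line evaluates a worksheet formula on a text constant
    ' and prints the result to the Immediate window.
    Debug.Print Evaluate("LEN(""Excel"")")                      ' 5
    Debug.Print Evaluate("LEFT(""Excel"",2)")                   ' Ex
    Debug.Print Evaluate("RIGHT(""Excel"",3)")                  ' cel
    Debug.Print Evaluate("FIND(""e"",""Excel"")")               ' 4 (case-sensitive)
    Debug.Print Evaluate("SEARCH(""e"",""Excel"")")             ' 1 (case-insensitive)
    Debug.Print Evaluate("SUBSTITUTE(""a-b-c"",""-"",""+"")")   ' a+b+c
    Debug.Print Evaluate("SUBSTITUTE(""a-b-c"",""-"",""+"",2)") ' a-b+c
    Debug.Print Evaluate("REPLACE(""1234567890"",3,2,""xx"")")  ' 12xx567890
End Sub
```

The two SUBSTITUTE lines show the effect of the last argument: without it every hyphen is replaced, with it only the second one.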
An example of using text functions in Excel
Let's look at some practical applications of text functions. For clarity, we will show how the function SUB-LINE works and the problem it solves. The first column of the table holds the full string. The second holds the value we need to extract from the first column. The third column lists the formulas that can be used to do this.
A function can refer to a cell in each of its arguments. For example, a substring number may be contained at a specific address. In this case, the formula will look like this:
And in this example, we will try to break the phone number into several parts.
The drawback of SUB-LINE is that it requires a separator, so it can only split text at a known character, such as the spaces between words or the separators between the groups of digits in a phone number.
If you need to separate one word from another, use a space as the separator. To do this, type an opening quotation mark, a space, and a closing quotation mark in the corresponding argument.
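For readers who want to experiment with this kind of "substring by number" lookup in a macro, here is one way it could be sketched in VBA on top of Split. The function name NthSubstring, its parameters, and the sample phone number are hypothetical and are not taken from the SUB-LINE function itself; the sketch assumes that substrings are numbered from 1.

```vba
' Hypothetical helper: returns the n-th substring of source (1-based),
' or an empty string if there is no such substring.
Function NthSubstring(ByVal source As String, ByVal delimiter As String, ByVal n As Long) As String
    Dim parts() As String
    parts = Split(source, delimiter)
    ' Split returns a zero-based array, so substring n sits at index n - 1.
    If n >= 1 And n - 1 <= UBound(parts) Then
        NthSubstring = parts(n - 1)
    End If
End Function

Sub NthSubstringDemo()
    Debug.Print NthSubstring("8-800-555-35-35", "-", 2)   ' 800
    Debug.Print NthSubstring("vremya ne zhdet", " ", 3)    ' zhdet
End Sub
```

Returning an empty string for an out-of-range number is a design choice made for this sketch; raising an error would be an equally reasonable alternative.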
Syntax of the MID function in Excel
The MID function in Excel is most often used to get part of a string and either use it in further calculations or simply write it to a cell. The reason for its popularity is simple: when a large amount of information has been imported from other programs, you often have to pull parts of it out by hand. With this function, that process can be automated at least a little. Let's take a closer look at it.
It takes three arguments, each of which is required: the text to extract from, the position to start at, and the number of characters to return. The data to be processed can be text written in a cell or text produced by another formula. Since we need to get a substring, we specify the following arguments:
- Text. The text string from which we will get the "truncated" version. Besides the result of another function or a cell reference, a text constant can also be used as this argument. In practice, though, a constant is mostly useful for learning. In real work it is rarely needed, since you can always type the desired piece of text into a cell by hand.
- Starting position. The character count for this argument starts from the very first character on the left of the string. This function differs from some others in that the characters are counted from the number 1, not zero.
- Number of characters. The total number of characters to take, counted from the starting position. In practice the minimum value is 1. It is possible to specify 0, but in that case the result will be an empty string.
In its most general form, the formula looks like this: =MID(text; start_position; number_of_characters)
There is another version of this function, MIDB, which works with multibyte strings. Our language has no such characters, so it is enough to know that the function exists. Running the formula has two possible outcomes:
- Error. If the function arguments were incorrectly specified, the #VALUE! error appears. Typical causes for this error are a zero start position or a negative value in the Number of Characters argument.
- String. If all parameters were specified correctly, we get the resulting text string.
Here are a few things to keep in mind when using this feature:
- If the start position is greater than the total length of the string, the function returns a zero-length (that is, empty) string.
- If the start position is less than the total length of the string, but the sum of the start position and the number of characters is greater than the total length of the text, the function returns the remaining characters, starting from the specified position. So you can deliberately specify a large number of characters to make the function return everything from the start position to the very end of the string.
- The #VALUE! error occurs in the following situations: the starting position is less than one, or the number of characters (or the number of bytes, for the function MIDB) is negative.
The function MIDB will only be of interest if you maintain an Excel spreadsheet in Japanese, Chinese or Korean. In those languages, some characters take up more than one byte in memory.
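The rules above can be checked quickly from VBA by evaluating the worksheet formula on a sample string. This is a sketch under assumptions: the macro name is made up, the sample string is chosen only for illustration, and the formulas passed to Evaluate use English function names and comma separators.

```vba
Sub MidEdgeCases()
    Const S As String = "1234567890"

    ' Normal case: one character starting at position 3.
    Debug.Print Evaluate("MID(""" & S & """,3,1)")             ' 3

    ' Number of characters runs past the end: the rest of the string is returned.
    Debug.Print Evaluate("MID(""" & S & """,8,100)")           ' 890

    ' Start position beyond the end of the string: empty string.
    Debug.Print Len(Evaluate("MID(""" & S & """,11,2)"))       ' 0

    ' Start position below 1 or a negative length: #VALUE! error.
    Debug.Print IsError(Evaluate("MID(""" & S & """,0,1)"))    ' True
    Debug.Print IsError(Evaluate("MID(""" & S & """,1,-1)"))   ' True
End Sub
```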
Substring from a string in Excel using the MID function
Let's look at a small example of how to extract individual characters from a string using the function MID. Take a very simple situation. Suppose cell B14 holds a simple string consisting of the digits 1 through 9 followed by 0, and we need to get the digit 3 from the string 1234567890. In this case, the formula is: =MID(B14;3;1).
In plain words, we tell the program to return one character from this sequence, starting at the third position in the string. Although the value looks numeric, in our example it is text. With these instructions, the formula returns 3. When might you need to solve a problem like this? First of all, when a set of characters is stored in one string and you need to pick out particular characters from it.
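The same extraction can be done in a macro with VBA's own Mid function, which takes its arguments in the same order. A minimal sketch, with a made-up macro name and the string typed in directly rather than read from cell B14:

```vba
Sub ListCharacters()
    Dim s As String
    Dim i As Long

    s = "1234567890"

    ' Same result as the worksheet formula: one character at position 3.
    Debug.Print Mid(s, 3, 1)          ' 3

    ' Walk the string one character at a time, positions counted from 1.
    For i = 1 To Len(s)
        Debug.Print i, Mid(s, i, 1)
    Next i
End Sub
```

One difference to keep in mind: where the worksheet function returns #VALUE! for a start position below 1, VBA's Mid stops the macro with a run-time error instead.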